Lightweight region memory management in a two-stage language
gist.github.com Lightweight region memory management in a two-stage language


This presents a method to reduce garbage-collector overhead in a language with multi-stage programming (specifically, two-level type theory) by using regions.

Linearizability! Refinement! Prophecy! (proving code correctness)
surfingcomplexity.blog Linearizability! Refinement! Prophecy!

Back in August, Murat Derimbas published a blog post about the paper by Herlihy and Wing that first introduced the concept of linearizability. When we move from sequential programs to concurrent on…


Introduction:

> Back in August, Murat Derimbas published a blog post about the paper by Herlihy and Wing that first introduced the concept of linearizability. When we move from sequential programs to concurrent ones, we need to extend our concept of what "correct" means to account for the fact that operations from different threads can overlap in time. Linearizability is the strongest consistency model for single-object systems, which means that it's the one that aligns closest to our intuitions. Other models are weaker and, hence, will permit anomalies that violate human intuition about how systems should behave.
>
> Beyond introducing linearizability, one of the things that Herlihy and Wing do in this paper is provide an implementation of a linearizable queue whose correctness cannot be demonstrated using an approach known as refinement mapping. At the time the paper was published, it was believed that it was always possible to use refinement mapping to prove that one specification implemented another, and this paper motivated Leslie Lamport and Martín Abadi to propose the concept of prophecy variables.
>
> I have long been fascinated by the concept of prophecy variables, but when I learned about them, I still couldn't figure out how to use them to prove that the queue implementation in the Herlihy and Wing paper is linearizable. (I even asked Leslie Lamport about it at the 2021 TLA+ conference).
>
> Recently, Lamport published a book called The Science of Concurrent Programs that describes in detail how to use prophecy variables to do the refinement mapping for the queue in the Herlihy and Wing paper. Because the best way to learn something is to explain it, I wanted to write a blog post about this.
>
> In this post, I'm going to assume that readers have no prior knowledge about TLA+ or linearizability. What I want to do here is provide the reader with some intuition about the important concepts, enough to interest people to read further. There's a lot of conceptual ground to cover: to understand prophecy variables and why they're needed for the queue implementation in the Herlihy and Wing paper requires an understanding of refinement mapping. Understanding refinement mapping requires understanding the state-machine model that TLA+ uses for modeling programs and systems. We'll also need to cover what linearizability actually is.
>
> We're going to start all of the way at the beginning: describing what it is that a program should do.

An expression parser (small DSL) written in C
github.com GitHub - torrentg/expr: An expression parser supporting multiple types

An expression parser supporting multiple types. Contribute to torrentg/expr development by creating an account on GitHub.


> ### Key Features
>
> - Multiple types (number, bool, datetime, string and error)
> - Memory managed by user (no allocs)
> - Iterator based interface
> - Supporting variables
> - Stateless
> - Expressions can be compiled (RPN stack)
> - Fully compile-time checked syntax
> - Documented grammar
> - Standard C11 code
> - No dependencies
>
> ### Examples
>
> ```c
> # Numerical calculations
> sin((-1 + 2) * PI)
>
> # Dates
> datetrunc(now(), "day")
>
> # Strings
> "hi " + upper("bob") + trim(" ! ")
>
> # Conditionals
> ifelse(1 < 5 && length($alphabet) > 25, "case1", "case2")
>
> # Find the missing letter
> replace($alphabet, substr($alphabet, 25 - random(0, length($alphabet)), 1), "")
> ```

Hy: A LISP dialect embedded in Python

From homepage:

> Hy (or "Hylang" for long) is a multi-paradigm general-purpose programming language in the Lisp family. It's implemented as a kind of alternative syntax for Python. Compared to Python, Hy offers a variety of new features, generalizations, and syntactic simplifications, as would be expected of a Lisp. Compared to other Lisps, Hy provides direct access to Python's built-ins and third-party Python libraries, while allowing you to freely mix imperative, functional, and object-oriented styles of programming. (More on "Why Hy?")

Some examples on the homepage:

> Hy:
>
> ```hy
> (defmacro do-while [test #* body]
>   `(do
>     ~@body
>     (while ~test
>       ~@body)))
>
> (setv x 0)
> (do-while x
>   (print "Printed once."))
> ```
>
> Python:
>
> ```python
> x = 0
> print("Printed once.")
> while x:
>     print("Printed once.")
> ```

Interestingly, programming.dev's Markdown renderer highlights `hy` code blocks. Maybe it knows the language ([highlight.js has it](https://github.com/highlightjs/highlight.js/blob/main/SUPPORTED_LANGUAGES.md)). Maybe it's using [Hybris](https://it-m-wikipedia-org.translate.goog/wiki/Hybris_(linguaggio_di_programmazione)?_x_tr_sl=it&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=wapp) (another language that could get its own post; one of its extensions is `*.hy`).

GitHub

Online REPL

1.0 announcement

Tiny Great Languages: Languages in under 75 lines of code (blog series + GitHub)
zserge.com Tiny Great Languages: Assembly

Summing up years of building interpreters and compilers for various programming languages. The first chapter is about assembly language. We will try to implement a tiny two-pass assembler for CPython VM.

GitHub (source code for all languages), also linked above.

The GitHub repo says "50 lines of code", but the largest example is 74 lines excluding whitespace and comments.

Dune Shell: bash + lisp
adam-mcdaniel.github.io Dune Shell

A shell🐚 by the beach🏖️!


> Dune is a shell designed for powerful scripting. Think of it as an unholy combination of bash and Lisp.
>
> You can do all the normal shell operations like piping, file redirection, and running programs. But, you also have access to a standard library and functional programming abstractions for various programming and sysadmin tasks!

Fennel: a Lua-like LISP

> Fennel is a programming language that brings together the simplicity, speed, and reach of Lua with the flexibility of a lisp syntax and macro system.
>
> - Full Lua compatibility: Easily call any Lua function or library from Fennel and vice-versa.
> - Zero overhead: Compiled code should be just as efficient as hand-written Lua.
> - Compile-time macros: Ship compiled code with no runtime dependency on Fennel.
> - Embeddable: Fennel is a one-file library as well as an executable. Embed it in other programs to support runtime extensibility and interactive development.
>
> Anywhere you can run Lua code, you can run Fennel code.

Example:

> ```lisp
> ;; Sample: read the state of the keyboard and move the player accordingly
> (local dirs {:up [0 -1] :down [0 1] :left [-1 0] :right [1 0]})
>
> (each [key [dx dy] (pairs dirs)]
>   (when (love.keyboard.isDown key)
>     (let [[px py] player
>           x (+ px (* dx player.speed dt))
>           y (+ py (* dy player.speed dt))]
>       (world:move player x y))))
> ```

Snapshottable Stores (paper)
dl.acm.org Snapshottable Stores | Proceedings of the ACM on Programming Languages

We say that an imperative data structure is snapshottable or supports snapshots if we can efficiently capture its current state, and restore a previously captured state to become the current state again. This is useful, for example, to implement ...


Abstract:

> We say that an imperative data structure is snapshottable or supports snapshots if we can efficiently capture its current state, and restore a previously captured state to become the current state again. This is useful, for example, to implement backtracking search processes that update the data structure during search.
>
> Inspired by a data structure proposed in 1978 by Baker, we present a snapshottable store, a bag of mutable references that supports snapshots. Instead of capturing and restoring an array, we can capture an arbitrary set of references (of any type) and restore all of them at once. This snapshottable store can be used as a building block to support snapshots for arbitrary data structures, by simply replacing all mutable references in the data structure by our store references. We present use-cases of a snapshottable store when implementing type-checkers and automated theorem provers.
>
> Our implementation is designed to provide a very low overhead over normal references, in the common case where the capture/restore operations are infrequent. Read and write in store references are essentially as fast as in plain references in most situations, thanks to a key optimisation we call record elision. In comparison, the common approach of replacing references by integer indices into a persistent map incurs a logarithmic overhead on reads and writes, and sophisticated algorithms typically impose much larger constant factors.
>
> The implementation, which is inspired by Baker's and the OCaml implementation of persistent arrays by Conchon and Filliâtre, is both fairly short and very hard to understand: it relies on shared mutable state in subtle ways. We provide a mechanized proof of correctness of its core using the Iris framework for the Coq proof assistant.
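To get an intuition for the interface the abstract describes, here is a naive journaling sketch in Python. This is illustrative only: the paper's actual implementation (with record elision) is far subtler, and this toy version only supports restoring to *earlier* snapshots, backtracking-style.

```python
class Ref:
    """A mutable reference that records its old value in its store's journal."""
    def __init__(self, store, value):
        self._store, self._value = store, value

    def get(self):
        return self._value

    def set(self, value):
        # Remember how to undo this write before performing it.
        self._store._journal.append((self, self._value))
        self._value = value

class Store:
    """A bag of mutable references whose state can be captured and restored."""
    def __init__(self):
        self._journal = []  # (ref, old_value) pairs, newest last

    def ref(self, value):
        return Ref(self, value)

    def capture(self):
        # A snapshot is just a position in the journal.
        return len(self._journal)

    def restore(self, snapshot):
        # Undo every write recorded after the snapshot, newest first.
        while len(self._journal) > snapshot:
            ref, old = self._journal.pop()
            ref._value = old
```

Restoring then amounts to replaying the journal backwards, which is why reads and writes stay cheap when capture/restore is rare.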

Rɐbbit: Reactive Data Board & Visual Flow Platform (node-based VPE)
github.com GitHub - ryrobes/rvbbit: Reactive Data Board & Visual Flow Platform

Reactive Data Board & Visual Flow Platform. Contribute to ryrobes/rvbbit development by creating an account on GitHub.


Back from break, I'm going to start posting regularly again.

This is an interesting node-based visual programming environment with very cool graphics. It's written in Clojure, but the various "cards" (nodes) can include SQL, R, DALL-E API calls, or anything else.


graydon/rust-prehistory: historical archive of rust pre-publication development
github.com GitHub - graydon/rust-prehistory: historical archive of rust pre-publication development

historical archive of rust pre-publication development - graydon/rust-prehistory


Excerpt:

> This is a reconstruction -- extracted and very lightly edited -- of the "prehistory" of Rust development, when it was a personal project between 2006-2009 and, after late 2009, a Mozilla project conducted in private.
>
> The purposes of publishing this now are:
>
> - It might encourage someone with a hobby project to persevere
> - It might be of interest to language or CS researchers someday
> - It might settle some arguments / disputed historical claims

Rust started being developed 18 years ago. This is how it looked until 14 years ago, after which I believe development continued in rust-lang/rust. The first Rust program looks completely different from the Rust we know today.

Firedancer: DSL for defining bullet-hell patterns (demo)
firedancer-lang.com Firedancer

Haxe-based language for defining 2D shmups bullet-hell patterns.


GitHub

> Haxe-based language for defining 2D shmups bullet-hell patterns.

The VM also runs in Haxe.

Soundly Handling Linearity (blog post + paper)
blog.sigplan.org Soundly Handling Linearity

Programming languages can statically ensure that certain resources are used exactly once through a linear type system based on Girard’s linear logic. Typical linear type systems track the num…


> Soundly handling linearity requires special care in the presence of effect handlers, as the programmer may inadvertently compromise the integrity of a linear resource. For instance, duplicating a continuation that closes over a resource can lead to the internal state of the resource being corrupted or discarding the continuation can lead to resource leakage. Thus a naïve combination of linear resources and effect handlers yields an unsound system.

...

> In the remainder of this blog post we describe a novel approach to rule out such soundness bugs by tracking control-flow linearity, a means to statically assure how often a continuation may be invoked which mediates between linear resources and effectful operations in order to ensure that effect handlers cannot violate linearity constraints on resources. We focus on our implementation in Links. The full technical details are available in our open access POPL'24 distinguished paper Soundly Handling Linearity.

Syndicated Actors

> Programming models like the Actor model and the Tuplespace model make great strides toward simplifying programs that communicate. However, a few key difficulties remain.
>
> The Syndicated Actor model addresses these difficulties. It is closely related to both Actors and Tuplespaces, but builds on a different underlying primitive: eventually-consistent replication of state among actors. Its design also draws on widely deployed but informal ideas like publish/subscribe messaging.

For reference, actors and tuple-spaces are means to implement concurrent programs. Actors are essentially tiny programs/processes that send (push) messages to each other, while a tuple-space is a shared repository of data ("tuples") that can be accessed (pulled) by different processes (e.g. actors).
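The push/pull contrast can be sketched in a few lines of Python (illustrative only, and nothing to do with the Syndicated Actor model's replication primitive): senders push messages into an actor's private mailbox, while a tuplespace is a shared bag that any process can pull matching tuples out of.

```python
import queue

class Actor:
    """Push model: each actor owns a private mailbox; senders push into it."""
    def __init__(self):
        self.mailbox = queue.Queue()

    def send(self, msg):
        self.mailbox.put(msg)       # anyone may push...

    def receive(self):
        return self.mailbox.get()   # ...but only this actor pulls from its mailbox

class Tuplespace:
    """Pull model: a shared bag of tuples; any process can put or take."""
    def __init__(self):
        self.tuples = []

    def put(self, tup):
        self.tuples.append(tup)

    def take(self, pattern):
        # Remove and return the first tuple matching a predicate, if any.
        for i, t in enumerate(self.tuples):
            if pattern(t):
                return self.tuples.pop(i)
        return None
```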

...

> A handful of Domain-Specific Language (DSL) constructs, together dubbed Syndicate, expose the primitives of the Syndicated Actor model, the features of dataspaces, and the concepts of conversational concurrency to the programmer in an ergonomic way.

...

> To give some of the flavour of working with Syndicate DSL constructs, here's a program written in JavaScript extended with Syndicate constructs:
>
> ```javascript
> function chat(initialNickname, sharedDataspace, stdin) {
>   spawn 'chat-client' {
>     field nickName = initialNickname;
>
>     at sharedDataspace assert Present(this.nickname);
>     during sharedDataspace asserted Present($who) {
>       on start console.log(`${who} arrived`);
>       on stop console.log(`${who} left`);
>       on sharedDataspace message Says(who, $what) {
>         console.log(`${who}: ${what}`);
>       }
>     }
>
>     on stdin message Line($text) {
>       if (text.startsWith('/nick ')) {
>         this.nickname = text.slice(6);
>       } else {
>         send sharedDataspace message Says(this.nickname, text);
>       }
>     }
>   }
> }
> ```

Documentation

Comparison with other programming models

History

Author's thesis

You should make a new programming language

Even though it's very unlikely to become popular (and if so, it will probably take a while), there's a lot you learn from creating a programming language that applies to other areas of software development. Plus, it's fun!

First impressions of Gleam: lots of joys and some rough edges

The blog post gives the author's impressions of Gleam after its 1.4.0 release. Gleam is an up-and-coming language that has been getting a lot of highly-ranked articles.

It runs on the Erlang virtual machine (BEAM), making it great for distributed programs and a competitor to Elixir and Erlang (the language). It also compiles to JavaScript, making it a competitor to TypeScript.

But unlike Elixir, Erlang, and TypeScript, it's statically typed (not dynamically or gradually typed). It has "functional" concepts like algebraic data types, immutable values, and first-class functions. The syntax is modeled after Rust's and its tutorial after Go's. Lastly, it has a very large community.

Zyme: an evolvable language for genetic programming
zyme.dev Zyme - an evolvable language

an esoteric language for genetic programming.

> Zyme is an esoteric language for genetic programming: creating computer programs by means of natural selection.
>
> For successful evolution mutations must generate a wide range of phenotypic variation, a feat nearly impossible when randomly modifying the code of conventional programming languages. Zyme is designed to maximize the likelihood of a bytecode mutation creating a novel yet non-fatal change in program behavior.
>
> Diverging from conventional register or stack-based architectures, Zyme uses a unique molecular automaton-based virtual machine, mimicking an abstract cellular metabolism. This design enables fuzzy control flow and precludes invalid runtime states, transforming potential crashes into opportunities for adaptation.

Very unique, even for an esoteric language. Imagine a program that gets put through natural selection and "evolves" like a species: the program is cloned many times, each clone is slightly mutated, the clones that perform worse on some metric are discarded, and the process repeats until eventually you have programs that do great on the metric, programs you didn't write.
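The clone/mutate/select loop just described can be shown in a toy Python version (not Zyme: here "programs" are bit strings, the metric counts ones, and a mutation flips one bit):

```python
import random

def mutate(program):
    # Flip one randomly chosen bit of the program.
    i = random.randrange(len(program))
    return program[:i] + (1 - program[i],) + program[i + 1:]

def evolve(generations=200, pop_size=50, length=16):
    # Start from a random population of bit-string "programs".
    population = [tuple(random.randint(0, 1) for _ in range(length))
                  for _ in range(pop_size)]
    for _ in range(generations):
        clones = [mutate(p) for p in population]          # clone + mutate
        # Selection: keep the best half of parents and clones by the metric.
        population = sorted(population + clones, key=sum, reverse=True)[:pop_size]
    return population[0]

random.seed(0)
best = evolve()   # a near-all-ones program nobody wrote by hand
```

Zyme's contribution is making the mutate step productive for real bytecode, where a random flip would normally just crash the program.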

Geometry Bugs and Geometry Types
www.cs.cornell.edu Geometry Bugs and Geometry Types

A special kind of bug exists in code that has to deal with geometric concepts like positions, directions, coordinate systems, and all that. Long ago, in an OOPSLA 2020 paper, we defined geometry bugs and designed a type system to catch them. This post demonstrates the idea through some buggy GLSL sh...


Key excerpt:

> At OOPSLA 2020, Prof. Dietrich Geisler published a paper about geometry bugs and a type system that can catch them. The idea hasn't exactly taken over the world, and I wish it would. The paper's core insight is that, to do a good job with this kind of type system, you need your types to encode three pieces of information:
>
> - the reference frame (like model, world, or view space)
> - the coordinate scheme (like Cartesian, homogeneous, or polar coordinates)
> - the geometric object (like positions and directions)
>
> In Dietrich's language, these types are spelled scheme<frame>.object. Dietrich implemented these types in a language called Gator with help from Irene Yoon, Aditi Kabra, Horace He, and Yinnon Sanders. With a few helper functions, you can get Gator to help you catch all the geometric pitfalls we saw in this post.
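A dynamically checked toy version of the same idea fits in a few lines of Python (Gator's types are richer and static; all names here are invented for illustration): tag every value with its frame, scheme, and object kind, and reject operations that mix them.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Geo:
    frame: str      # reference frame: "model", "world", "view", ...
    scheme: str     # coordinate scheme: "cartesian", "homogeneous", ...
    kind: str       # geometric object: "position" or "direction"
    coords: tuple

    def __add__(self, other):
        if (self.frame, self.scheme) != (other.frame, other.scheme):
            raise TypeError("geometry bug: mixing frames or schemes")
        if self.kind == other.kind == "position":
            raise TypeError("geometry bug: adding two positions")
        # position + direction = position; direction + direction = direction
        kind = "position" if "position" in (self.kind, other.kind) else "direction"
        return Geo(self.frame, self.scheme, kind,
                   tuple(a + b for a, b in zip(self.coords, other.coords)))
```

The point of the paper is that these checks can be done at compile time, so a model-space direction never silently flows into world-space arithmetic.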

Interval Parsing Grammars for File Format Parsing (paper)
dl.acm.org Interval Parsing Grammars for File Format Parsing | Proceedings of the ACM on Programming Languages

File formats specify how data is encoded for persistent storage. They cannot be formalized as context-free grammars since their specifications include context-sensitive patterns such as the random access pattern and the type-length-value pattern. We ...


Abstract:

> File formats specify how data is encoded for persistent storage. They cannot be formalized as context-free grammars since their specifications include context-sensitive patterns such as the random access pattern and the type-length-value pattern. We propose a new grammar mechanism called Interval Parsing Grammars (IPGs) for file format specifications. An IPG attaches to every nonterminal/terminal an interval, which specifies the range of input the nonterminal/terminal consumes. By connecting intervals and attributes, the context-sensitive patterns in file formats can be well handled. In this paper, we formalize IPGs' syntax as well as its semantics, and its semantics naturally leads to a parser generator that generates a recursive-descent parser from an IPG. In general, IPGs are declarative, modular, and enable termination checking. We have used IPGs to specify a number of file formats including ZIP, ELF, GIF, PE, and part of PDF; we have also evaluated the performance of the generated parsers.
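The type-length-value pattern mentioned in the abstract is easy to see in a hand-written Python sketch (my own illustration, not code from the paper): the length field determines how much input the value occupies, which is exactly the context-sensitivity a context-free grammar can't express. An IPG attaches that interval to the nonterminal; here we thread it manually.

```python
import struct

def parse_tlv(buf, start, end):
    """Parse one TLV record inside buf[start:end]; return (record, next_offset)."""
    tag = buf[start]
    # Big-endian u16 length field right after the tag byte.
    (length,) = struct.unpack_from(">H", buf, start + 1)
    value_start = start + 3
    value_end = value_start + length
    # The interval check an IPG would enforce declaratively:
    if value_end > end:
        raise ValueError("length field points outside the enclosing interval")
    return (tag, buf[value_start:value_end]), value_end

# One record: tag 0x01, length 3, value b"abc".
buf = bytes([0x01, 0x00, 0x03]) + b"abc"
record, nxt = parse_tlv(buf, 0, len(buf))
```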

A real chicken-and-egg situation
  • C++’s mascot is an obese sick rat with a missing foot*, because it has 1000+ line compiler errors (the stress makes you overeat and damages your immune system) and footguns.

    EDIT: Source (I didn't make up the C++ part)

  • Methods Should Be Object Safe
  • I could understand method = associated function whose first parameter is named self, so it can be called like self.foo(…). This would mean functions like Vec::new aren’t methods. But the author’s requirement also excludes functions that take generic arguments like Extend::extend.

    However, even the above definition gives old terminology new meaning. In traditional OOP languages, all functions in a class are considered methods: those only callable from an instance are “instance methods”, while the others are “static methods”. So translating OOP terminology into Rust, all associated functions are still considered methods, and those with/without method call syntax are instance/static methods.

    Unfortunately I think that some people misuse “method” to only refer to “instance method”, even in the OOP languages, so to be 100% unambiguous the terms have to be:

    • Associated function: function in an impl block.
    • Static method: associated function whose first argument isn’t self (even if it takes Self under a different name, like Box::leak).
    • Instance method: associated function whose first argument is self, so it can be called like self.foo(…).
    • Object-safe method: a method callable from a trait object.
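    The distinctions in the list above map directly onto Python, where the OOP terms are native (a made-up `Vec` class; Rust analogues in comments):

```python
class Vec:
    # associated function: any function in the class body (the "impl block")
    def __init__(self):
        self.items = []

    @staticmethod
    def new():                # static method: no self argument (cf. Vec::new)
        return Vec()

    def push(self, item):     # instance method: first argument is self,
        self.items.append(item)  # so it's callable as v.push(...)
```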
  • Removed
    Writing an EBNF -> Scheme parser generator using AI for machine-unfriendly descriptions, and further enhancements?
  • I find writing the parser by hand (recursive descent) to be easiest. Sometimes I use a lexer generator, or if there isn’t a good one (idk for Scheme), write the lexer by hand as well. Define a few helper functions and macros to remove most of the boilerplate (you really benefit from Scheme here), and you almost end up writing the rules out directly.

    Yes, you need to manually implement choice and figure out what/when to lookahead. Yes, the final parser won’t be as readable as a BNF specification. But I find writing a hand-rolled parser for most grammars, even with the manual choice and lookahead, to be surprisingly easy and fast.

    The problem with parser generators is that, when they work they work well, but when they don’t work (fail to generate, the generated parser tries to parse the wrong node, the generated parser is very inefficient) it can be really hard to figure out why. A hand-rolled parser is much easier to debug, so when your grammar inevitably has problems, it ends up taking less time in total to go from spec to working hand-rolled vs. spec to working parser-generator-generated.

    The hand-rolled rules may look something like (with user-defined macros and functions define-parse, parse, peek, next, and some simple rules like con-id and name-id as individual tokens):

    ; pattern ::= [ con-id ] factor "begin" expr-list "end"
    (define-parse pattern
      (mk-pattern
        (if (con-id? (peek)) (next))
        (parse factor)
        (do (parse "begin") (parse expr-list) (parse "end"))))
    
    ; factor ::= name-id
    ;          | symbol-literal
    ;          | long-name-id
    ;          | numeric-literal
    ;          | text-literal
    ;          | list-literal
    ;          | function-lambda
    ;          | tacit-arg
    ;          | '(' expr ')'
    (define-parse factor
      (case (peek)
        [name-id? (if (= "." (peek2)) (mk-long-name-id …) (next))]
        [literal? (next)]
        …))
    

    Since you’re using Scheme, you can almost certainly optimize the above to reduce even more boilerplate.
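    For a runnable taste of the same peek/next style outside Scheme, here is a minimal hand-rolled recursive-descent parser in Python for a made-up grammar (`expr ::= term { "+" term }`, `term ::= NUMBER | "(" expr ")"`):

```python
import re

def tokenize(s):
    # Numbers, plus signs, and parentheses; everything else is ignored.
    return re.findall(r"\d+|[+()]", s)

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() == "+":   # manual choice + one-token lookahead
            self.next()
            value += self.term()
        return value

    def term(self):
        if self.peek() == "(":
            self.next()
            value = self.expr()
            assert self.next() == ")", "expected ')'"
            return value
        return int(self.next())

result = Parser(tokenize("(1+2)+3")).expr()   # evaluates while parsing
```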

    Regarding LLMs: if you start to write the parser with the EBNF comments above each rule like above, you can paste the EBNF in and Copilot will generate rules for you. Alternatively, you can feed a couple EBNF/code examples to ChatGPT and ask it to generate more.

    In both cases the AI will probably make mistakes on tricky cases, but that’s practically inevitable. An LLM implemented in an error-proof code synthesis engine would be a breakthrough; and while there are techniques like fine-tuning, I suspect they wouldn’t improve the accuracy much, and certainly would amount to more effort in total (in fact most LLM “applications” are just a custom prompt on plain ChatGPT or another baseline model).

  • I'm betting on Call-by-Push-Value
  • I believe the answer is yes, except that we’re talking about languages with currying, and those can’t represent a zero argument function without the “computation” kind (remember: all functions are Arg -> Ret, and a multi-argument function is just Arg1 -> (Arg2 -> Ret)). In the linked article, all functions are in fact “computations” (the two variants of CompType are Thunk ValType and Fun ValType CompType). The author also describes computations as “a way to add side-effects to values”, and the equivalent in an imperative language to “a value which produces side-effects when read” is either a zero-argument function (getXYZ()), or a “getter” which is just syntax sugar for a zero-argument function.

    The other reason may be that it’s easier in an IR to represent computations as intrinsic types vs. zero-argument closures. Except if all functions are computations, then your “computation” type is already your closure type. So the difference is again only if you’re writing an IR for a language with currying: without CBPV you could just represent closures as things that take one argument, but CBPV permits zero-argument closures.
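    To make the imperative-language analogy concrete, here is the “value which produces side-effects when read” in Python, both as an explicit zero-argument function and as a getter (names invented for illustration):

```python
log = []

def get_xyz():
    # A "computation": running it performs an effect and yields a result.
    log.append("read")
    return 42

class Config:
    @property
    def xyz(self):
        # A getter: syntax sugar for the same zero-argument call.
        log.append("read")
        return 42
```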

  • Using Go as a compiler backend?
    Go as a backend language isn’t super unusual; there’s at least one other project (https://borgo-lang.github.io) which has chosen it. And there are many languages which compile to JavaScript or C, but Go strikes a balance: faster than JavaScript, while still having memory management, unlike C.

    I don’t think panics revealing the Go backend are much of an issue, because true “panics” that aren’t handled by the language itself are always bad. If you compile to LLVM, you must implement your own debug symbols to get nice-looking stack traces and line-by-line debugging like C and Rust; otherwise debugging is impossible and crashes show you raw assembly. Even in Java or JavaScript, core dumps are hard to debug, ugly, and leak internal details; the reason these languages have nice exceptions is that they implement exceptions and detect errors on their own before they become “panics”, so that when a program crashes in Java (like trying to dereference null) it doesn’t crash the JVM. Golang’s backtrace will probably be much nicer than the default of C or LLVM, and you may be able to implement a system like Java’s which catches most errors and gives your own stack trace beforehand.

    Elm’s kernel controversy is also something completely different. The problem with Elm is that the language maintainers explicitly prevented people from writing FFI to/from JavaScript except in the maintainers’ own packages, after allowing this feature for a while, so many old packages broke and were unfixable. And there were more issues: the language itself was very limited (meaning JS FFI was essential) and the maintainers’ responses were concerning (see “Why I’m leaving Elm”). Even Rust has features that are only accessible to the standard library and compiler (“nightly”), but they have a mechanism to let you use them if you really want, and none of them are essential like Elm-to-JS FFI, so most people don’t care. Basically, as long as you don’t become very popular and make a massively inconvenient, backwards-incompatible change for purely design reasons, you won’t have this issue: it’s not even “you have to implement Go FFI”, not even “if you do implement Go FFI, don’t restrict it to your own code”, it’s “don’t implement Go FFI and allow it everywhere, become very popular, then suddenly restrict it to your own code with no decent alternatives”.
