Binary Representation of Source Code

Represents the source code itself. It is not executable code of any kind (assembly or bytecode), and it is not an IR.
Take any valid text source file, convert it to the binary representation and back again, and end up with a byte-for-byte identical file.
Does not store individual tokens (i.e. no LEFT_BRACE), but does need to keep things like whitespace and comments.
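A minimal sketch of the round-trip requirement, assuming a toy format invented for illustration: the source is cut into whitespace, comment and code segments, and each is stored as a length-prefixed record. The node kinds, the `#` comment rule and the record layout are all assumptions, and a real format would store richer syntax nodes rather than flat segments. The point is only that keeping whitespace and comments makes the conversion lossless.

```python
import re
import struct

# Hypothetical node kinds; a real format would use richer syntax nodes.
KIND_WHITESPACE, KIND_COMMENT, KIND_CODE = 0, 1, 2

# Every byte of the input falls into exactly one of these three alternatives.
# Comment detection is naive: a '#' inside a string literal is misclassified,
# which affects the kind label but not the round trip.
_SEGMENT = re.compile(rb"(?P<ws>\s+)|(?P<comment>#[^\n]*)|(?P<code>[^\s#]+)")
_HEADER = struct.Struct("<BI")  # kind (1 byte), payload length (4 bytes)


def encode(source: bytes) -> bytes:
    """Turn text source into a sequence of length-prefixed binary records."""
    out = bytearray()
    for match in _SEGMENT.finditer(source):
        if match.group("ws") is not None:
            kind = KIND_WHITESPACE
        elif match.group("comment") is not None:
            kind = KIND_COMMENT
        else:
            kind = KIND_CODE
        payload = match.group(0)
        out += _HEADER.pack(kind, len(payload)) + payload
    return bytes(out)


def decode(blob: bytes) -> bytes:
    """Rebuild the original text by concatenating every record payload."""
    out = bytearray()
    offset = 0
    while offset < len(blob):
        _kind, length = _HEADER.unpack_from(blob, offset)
        offset += _HEADER.size
        out += blob[offset:offset + length]
        offset += length
    return bytes(out)
```

Because tabs, trailing spaces and line endings are stored verbatim, `decode(encode(source)) == source` holds for any input bytes.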
Edit source code, not text.
But still allow people to use standard text editors.
Also allows for non-text, source-code-specific editors.
- Quick and efficient editing of the binary format (i.e. quickgo/quickrust concept programs).
- Graphically represent source code (not the same as a graphical programming language, e.g. Blockly; just an easier way to read code).
- Having things like frames around things like data structures and function definitions.
- Could have UML-like representations (not advocating for UML specifically, but it's a possibility).
- Easy/quick navigation of source code. Things like go-to-definition would be much easier to represent.
Makes tooling much easier. Libraries for manipulating the code could be shared by all tooling.
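As a rough illustration of why structured source helps tooling, here is a sketch that uses Python's standard `ast` module as a stand-in for the binary format. `ast` is not the format described here (it drops comments and whitespace, so it cannot round-trip), and `index_definitions` is a made-up helper name. It only shows that go-to-definition becomes a lookup once the code is a tree rather than text.

```python
import ast


def index_definitions(source: str) -> dict:
    """Map each function or class name to the line where it is defined."""
    tree = ast.parse(source)
    return {
        node.name: node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


SAMPLE = '''\
class Point:
    pass

def distance(a, b):
    return 0
'''

definitions = index_definitions(SAMPLE)
```

An editor could answer "where is `distance` defined?" with `definitions["distance"]` instead of searching text.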
Downside: any time the syntax is invalid, everything breaks. But that happens anyway with normal code...
Could use a virtual filesystem to automatically convert stored binary to text or vice versa.
Any text you edit could basically have any syntax you like, although obviously a standardised version would be best.
Could allow for syntax changes.
Could allow for special keywords for editing with a basic text editor (e.g. 'def myfunctionname' could be hooked to insert a function definition nearby on file save, with the 'def' keyword removed).
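One possible reading of the save hook, sketched under heavy assumptions: the shorthand is `def name` with no following parenthesis, the hook strips the keyword in place, and a stub definition is appended at the end of the file rather than "nearby". The marker syntax, the stub body and the name `expand_on_save` are all hypothetical.

```python
import re

# Hypothetical shorthand: "def name" NOT followed by "(" marks a function
# that should be created. Ordinary "def name(...)" definitions are left alone.
_MARKER = re.compile(r"\bdef (\w+)\b(?!\s*\()")


def expand_on_save(text: str) -> str:
    """Strip the shorthand keyword and append a stub for each new name."""
    names = []

    def strip_keyword(match):
        names.append(match.group(1))
        return match.group(1)

    body = _MARKER.sub(strip_keyword, text)
    # dict.fromkeys removes duplicate names while keeping first-seen order.
    stubs = "".join(f"\n\ndef {name}():\n    pass\n" for name in dict.fromkeys(names))
    return body + stubs
```

Running the hook a second time changes nothing, because the generated stubs are ordinary definitions and no longer match the marker.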
Would be easier with a well-defined syntax for the source code (e.g. specify tabs vs spaces, number of newlines between functions).
But it might be better to just store tabs/spaces and newlines in the binary format.

Schemas not 'data structures'

Struct definitions are normally mixed in with the procedural source code.
Structures are a binding of **data types** to **variable names**.
Separate the **representation** from the **implementation**.
- Standard native in-memory representation, with the same performance and so on.
  - Allow for separate memory layout. Some architectures (for example Cell processors) require memory padding.
  - In-memory ordering.
  - Endianness?
- Serialization.
- Database backed.
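A small sketch of the split between representation and implementation, assuming a schema is just an ordered list of (name, type) pairs. `POINT_SCHEMA`, the type names and both helper functions are invented for illustration. The same schema feeds a native byte layout, where byte order and field order are implementation choices, and a JSON serialization.

```python
import json
import struct

# Hypothetical schema: an ordered binding of variable names to data types.
POINT_SCHEMA = [("x", "i32"), ("y", "i32"), ("flags", "u8")]

# Mapping from the schema's type names to struct format codes.
_STRUCT_CODES = {"i32": "i", "u8": "B", "f64": "d"}


def pack_native(schema, values, byte_order="<", field_order=None):
    """Pack values as raw bytes; byte order and field order are layout choices.

    Passing "@" as byte_order asks struct for native alignment, which may
    insert padding between fields.
    """
    order = field_order or [name for name, _ in schema]
    types = dict(schema)
    fmt = byte_order + "".join(_STRUCT_CODES[types[name]] for name in order)
    return struct.pack(fmt, *(values[name] for name in order))


def to_json(schema, values):
    """Serialize the same record as JSON, driven by the same schema."""
    return json.dumps({name: values[name] for name, _ in schema})


point = {"x": 1, "y": 2, "flags": 3}
little = pack_native(POINT_SCHEMA, point)
big = pack_native(POINT_SCHEMA, point, byte_order=">")
reordered = pack_native(POINT_SCHEMA, point, field_order=["flags", "x", "y"])
```

The code that uses `point` never sees which layout was chosen; a database-backed implementation would be one more function taking the same schema.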
Older OOP languages like C++ and Java also bind **methods/member functions** to **data structures**.
Newer languages like Rust and Go move away from OOP and primarily use interfaces (i.e. traits).
Conceptually, design these as **APIs**, not bound functions.
- Allow for standard native function calls, RPC calls (IPC or networked), etc.
- Some kind of distributed backend (Raft, Blockchain)
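A sketch of an API that is not bound to one implementation, assuming a made-up `Counter` interface. One backend is a plain native call; the other encodes each call as a JSON message, the way an RPC client might. The message format and the loopback "server" are assumptions that stand in for a real IPC or network transport.

```python
import json
from typing import Protocol


class Counter(Protocol):
    """The API: callers depend on this, not on any one implementation."""

    def add(self, amount: int) -> int: ...


class LocalCounter:
    """Native in-process implementation."""

    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


class RemoteCounter:
    """Same API, but each call is sent as a JSON message over a transport."""

    def __init__(self, send):
        self._send = send  # callable bytes -> bytes; stands in for a socket or pipe

    def add(self, amount):
        request = json.dumps({"method": "add", "args": [amount]}).encode()
        return json.loads(self._send(request))["result"]


def make_loopback_server(target):
    """Hypothetical server side: decode a request and dispatch it to target."""

    def handle(request: bytes) -> bytes:
        message = json.loads(request)
        result = getattr(target, message["method"])(*message["args"])
        return json.dumps({"result": result}).encode()

    return handle


def bump_twice(counter: Counter) -> int:
    """Caller code that works unchanged with either backend."""
    counter.add(1)
    return counter.add(1)
```

`bump_twice(LocalCounter())` and `bump_twice(RemoteCounter(make_loopback_server(LocalCounter())))` both return 2; only the wiring differs.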
Could allow security definitions in the schema, i.e. who can edit this variable. This lets you separate the security implementation from the data structure.
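A rough sketch of what a security definition in the schema might look like, assuming each field lists the roles allowed to edit it. `FieldSpec`, the role names and `set_field` are hypothetical. The check lives in one place, driven by the schema, rather than in the code that uses the record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    type_name: str
    writers: frozenset = frozenset()  # hypothetical: roles allowed to edit this field


ACCOUNT_SCHEMA = {
    "owner": FieldSpec("owner", "str", frozenset({"admin"})),
    "balance": FieldSpec("balance", "i64", frozenset({"admin", "teller"})),
}


def set_field(schema, record, name, value, role):
    """Write a field only if the schema allows this role to edit it."""
    if role not in schema[name].writers:
        raise PermissionError(f"{role} may not edit {name}")
    record[name] = value


account = {"owner": "alice", "balance": 0}
set_field(ACCOUNT_SCHEMA, account, "balance", 10, role="teller")
```

A teller trying to change `owner` would raise `PermissionError`, without the record type itself containing any security code.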

API Versioning

Function definitions and the like could be tracked, and breaking changes to their signatures noted automatically.
- Allow adding fields with defaults.
- Allow optional named arguments.
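A minimal sketch of automatic breaking-change detection, using Python's `inspect.signature` to compare two versions of a function. The `resize_*` functions and `breaking_changes` are invented examples. The check is deliberately simple: it ignores reordered positional parameters and type changes, and only shows that adding a parameter with a default is compatible while adding a required one is not.

```python
import inspect

_VARIADIC = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)


def breaking_changes(old, new):
    """List reasons why callers written against `old` could fail against `new`."""
    old_params = inspect.signature(old).parameters
    new_params = inspect.signature(new).parameters
    problems = []
    for name in old_params:
        if name not in new_params:
            problems.append(f"removed parameter {name!r}")
    for name, param in new_params.items():
        is_required = param.default is inspect.Parameter.empty and param.kind not in _VARIADIC
        if name not in old_params and is_required:
            problems.append(f"new required parameter {name!r}")
    return problems


def resize_v1(width, height): ...


def resize_v2(width, height, *, keep_aspect=False): ...  # optional named argument


def resize_v3(width, height, quality): ...  # new argument without a default
```

Going from `resize_v1` to `resize_v2` reports nothing, while going to `resize_v3` reports the new required parameter.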
