Revision as of 01:45, 21 December 2016

Binary Representation of Source Code

Represents the source code. Is not any kind of executable code (ASM/bytecode). Not an IR.
Take any valid text source file, convert it to the binary representation and back again, and end up with a byte-for-byte identical file.
Not storing individual tokens (ie no LEFT_BRACE). But do need to keep things like whitespace and comments.
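
A minimal sketch of the round-trip property, under loud assumptions: the chunk kinds, the `#` comment syntax and the tag-plus-length layout are all made up for illustration and are not a format these notes define. It chunks far more coarsely than a real design would; the only point is that whitespace and comments survive the trip.

```python
import re
import struct

# Hypothetical chunk kinds; the notes do not specify any of this.
KINDS = {"ws": 0, "comment": 1, "code": 2}

# Every byte is whitespace, the start of a '#' comment, or "other",
# so consecutive matches cover the whole input with no gaps.
_CHUNK = re.compile(rb"(?P<ws>\s+)|(?P<comment>#[^\n]*)|(?P<code>[^\s#]+)")


def encode(source: bytes) -> bytes:
    """Store each chunk as: 1-byte kind, 4-byte little-endian length, raw bytes."""
    out = bytearray()
    pos = 0
    for match in _CHUNK.finditer(source):
        assert match.start() == pos  # no bytes skipped between chunks
        pos = match.end()
        text = match.group()
        out += struct.pack("<BI", KINDS[match.lastgroup], len(text)) + text
    assert pos == len(source)  # nothing left over at the end
    return bytes(out)


def decode(blob: bytes) -> bytes:
    """Rebuild the original text by concatenating the stored chunks."""
    out = bytearray()
    pos = 0
    while pos < len(blob):
        _kind, length = struct.unpack_from("<BI", blob, pos)
        pos += 5  # size of the "<BI" header
        out += blob[pos:pos + length]
        pos += length
    return bytes(out)
```

Because tabs, runs of spaces and comments are stored verbatim as their own chunks, `decode(encode(src)) == src` holds for any input, including mixed indentation.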
Edit source code not text.
But still allows for people to use standard text editors.
Also allows for non-text sourcecode specific editors.
- Quick and efficient editing of the binary format (ie quickgo/quickrust concept programs).
- Graphically represent source code (not the same as a graphical programming language, ie blocky, just an easier way to read code).
- Having things like frames around things like data structures and function definitions.
- Could have UML like representations (Not advocating for UML specifically, but it's a possibility).
- Easy/quick navigation of source code. Things like goto definition would be much easier to represent.
Makes tooling much easier. Can allow for libraries for manipulation of the code that tooling can use.
Downside: any time you have invalid syntax everything breaks. But that happens anyway with normal code...
Could use a virtual filesystem to automatically convert stored binary to text or vice versa.
Any text you edit could basically have any syntax you like, although obviously a standardised version would be best.
Could allow for syntax changes.
Could allow for special keywords for editing with a basic text editor (ie 'def myfunctionname' could be hooked to actually insert a function definition nearby on file save and the 'def' keyword removed).
Would be easier with a well defined syntax for the source code (ie define tabs vs spaces, number of newlines between functions).
But might be better to just store tabs/spaces and newlines in the binary format.

Schemas not 'data structures'

Struct definitions are normally mixed in with the procedural instruction source code.
Structures are a binding of **data types** to **variable names**.
Separate the **representation** from the **implementation**.
- Standard native in memory with the same performance and so on.
  - Allow for separate memory layout. Some architectures (for example Cell processors) require memory padding.
  - In memory ordering.
  - Endianness?
- Serialization.
- Database backed.
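
One way to picture the split between representation and implementation is sketched below. The `Field` type, the `POINT` schema and the helper names are hypothetical; the sketch uses Python's `struct` module, where the byte-order prefix selects the layout: `<` is packed little-endian, `>` is packed big-endian, and `@` is the native layout with alignment padding.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str  # variable name
    code: str  # data type, as a struct format code ("I", "B", "d", ...)


# Hypothetical schema: names bound to types, with no layout decided yet.
POINT = (Field("id", "I"), Field("flag", "B"), Field("x", "d"))


def layout(schema, byte_order="<"):
    """Pick a concrete memory layout for a schema."""
    return struct.Struct(byte_order + "".join(f.code for f in schema))


def pack(schema, values, byte_order="<"):
    """Serialise a dict of values using the chosen layout."""
    return layout(schema, byte_order).pack(*(values[f.name] for f in schema))


def unpack(schema, blob, byte_order="<"):
    """Read values back into a dict keyed by field name."""
    raw = layout(schema, byte_order).unpack(blob)
    return dict(zip((f.name for f in schema), raw))
```

The same `POINT` schema yields 13 bytes in either packed layout, while `layout(POINT, "@").size` may be larger because the native layout inserts padding before the double. Code that reads fields by name does not need to know which layout was chosen.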
Older OOP languages like C++ and Java also bind **methods/member functions** to **data structures**.
Newer languages like Rust and Go move away from OOP and use interfaces (ie traits) primarily.
Conceptually design it as **APIs** not bound functions.
- Allow for standard native function calls, or RPC calls, etc... IPC or networked.
- Some kind of distributed backend (Raft, Blockchain)
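
A small sketch of the API idea, assuming a hypothetical `Counter` API and a JSON message shape invented for this example. The caller sees the same `add` call whether it runs as a native function call or is serialised and passed through a transport; here the transport is an in-process loopback standing in for IPC or a network.

```python
import json
from typing import Callable, Protocol


class Counter(Protocol):
    """The API: what can be called, independent of where it runs."""

    def add(self, amount: int) -> int: ...


class LocalCounter:
    """Implementation reached by an ordinary native call."""

    def __init__(self) -> None:
        self.total = 0

    def add(self, amount: int) -> int:
        self.total += amount
        return self.total


class RpcCounter:
    """Same API, but every call is serialised and sent through a transport."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport

    def add(self, amount: int) -> int:
        reply = self._transport(json.dumps({"method": "add", "args": [amount]}))
        return json.loads(reply)["result"]


def make_loopback_transport(target) -> Callable[[str], str]:
    """Stand-in for a socket or pipe: decodes the request and calls the target."""

    def transport(request: str) -> str:
        message = json.loads(request)
        result = getattr(target, message["method"])(*message["args"])
        return json.dumps({"result": result})

    return transport
```

Swapping `LocalCounter()` for `RpcCounter(make_loopback_transport(LocalCounter()))` changes how the call travels but not the calling code.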
Could allow security definitions in the schema. Ie, who can edit this variable. Allows you to separate the security implementation stuff from the data structure.
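
As an illustration only, a security definition in the schema might look like the sketch below. `SecuredField`, `Record` and the idea of a per-field set of allowed writers are assumptions for this example, not something these notes specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecuredField:
    name: str
    writers: frozenset  # hypothetical: users allowed to edit this variable


class Record:
    """Holds values and enforces the schema's write rules in one place."""

    def __init__(self, schema):
        self._schema = {f.name: f for f in schema}
        self._values = {}

    def set(self, user, name, value):
        # The check lives with the schema, not in each piece of calling code.
        if user not in self._schema[name].writers:
            raise PermissionError(f"{user} may not edit {name}")
        self._values[name] = value

    def get(self, name):
        return self._values[name]
```

The data structure's users never implement the check themselves; changing who may edit a field is a schema change rather than a code change.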

API Versioning

API version as a hash of the binary representation of the API?
Function definitions and the like could be tracked, and breaking changes to syntax noted automatically.
- Allow adding fields with defaults without API change.
- Allow optional named arguments.
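
A rough sketch of hashing an API to get its version. Several things here are assumptions: canonical JSON stands in for the binary representation, the signature tuples are an invented shape, and parameters with defaults are simply left out of the hash so that adding one does not change the version.

```python
import hashlib
import json


def api_version(functions):
    """Hash the required part of each signature.

    `functions` maps a function name to a list of
    (param_name, type_name, has_default) tuples (a hypothetical shape).
    """
    canonical = {
        name: [[param, type_name]
               for param, type_name, has_default in params
               if not has_default]  # defaulted params do not affect the version
        for name, params in functions.items()
    }
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()[:16]


v1 = {"greet": [("name", "str", False)]}
v2 = {"greet": [("name", "str", False), ("punctuation", "str", True)]}
v3 = {"greet": [("name", "bytes", False)]}
```

With these inputs `api_version(v1) == api_version(v2)`, since `v2` only adds a defaulted parameter, while `v3` changes a required type and gets a different hash. One known gap in this sketch: changing the type of a defaulted parameter would go unnoticed, so a real scheme would need a finer rule.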
Implementation stuff is harder (ie we changed the format of the string this function returns but the signature is the same).
- Changing the implementation doesn't mean the result is different (ie optimisation).
- Changing the implementation of a function could accidentally change the result (bug). Being told when that happens is handy.
- Allow specifying functions for specific API versions so if you do change the implementation you can keep backwards compatibility.
- How do consumers choose which version (ie specific version they used, or 'latest'?)... Compiled binaries could keep a list of the API versions used.
- Functions that have no source code changes can be safely ignored.
- Unit tests could provide a hint. (ie if this unit test changed...), but doing something like adding an extra test or changing the order doesn't mean the implementation's result is different.
- Automatic 'quickcheck' when possible? Compiler can implement a unit test with no effort from the programmer and log results. But you won't know when it's possible (ie halting problem, use of globals/statics, side effects, etc...). Maybe just best effort (ie if it didn't finish in 1 second and/or used more than 512 KB of RAM, kill the test). Don't store the results of tests that return a lot of stuff. Do store the meta information about killed tests and the number of items returned (or even better a hash of the items returned, pointers would be a pain though...).
- 'quickbench'? To benchmark performance? Obvious problems of different hardware but could still be useful.
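
The automatic 'quickcheck' idea can be sketched as a behaviour fingerprint: run a function over reproducible inputs and store only a hash of what came back. Everything below is an assumption made for illustration (integer-only inputs, the seed, the case count, the function names), and the time and memory limits mentioned above are left out to keep it short.

```python
import hashlib
import random


def behaviour_fingerprint(func, seed=0, cases=200):
    """Hash the results of calling func on a fixed, reproducible set of ints."""
    rng = random.Random(seed)
    # A few fixed edge cases first, then seeded pseudo-random values.
    inputs = [0, 1, -1] + [rng.randint(-1000, 1000) for _ in range(cases)]
    digest = hashlib.sha256()
    for arg in inputs:
        try:
            outcome = repr(func(arg))
        except Exception as exc:  # a crash is also observable behaviour
            outcome = "raised " + type(exc).__name__
        digest.update(outcome.encode())
        digest.update(b"\0")  # separator so results cannot run together
    return digest.hexdigest()


def double_v1(n):
    return n * 2


def double_v2(n):  # different implementation, same results
    return n + n


def double_bug(n):  # accidental change for negative inputs
    return n * 2 if n >= 0 else n
```

`double_v1` and `double_v2` share a fingerprint, so the rewrite can be treated as behaviour-preserving on the sampled inputs; `double_bug` does not, which is the "being told when that happens" case. A matching fingerprint is only evidence, not proof, that two implementations agree.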

Everything a library

Programming Language: Difference between revisions
