Programming Language

From Hegemon Wiki
Revision as of 01:22, 21 December 2016 by H3g3m0n (talk | contribs)

Binary Representation of Source Code

  • Represents the source code itself. It is not executable code of any kind (ASM/bytecode), and it is not an IR.
  • Any valid text source file can be converted to the binary representation and back again, yielding the same file byte for byte.
  • Does not store individual tokens (i.e. no LEFT_BRACE), but does need to keep things like whitespace and comments.
  • Edit source code, not text.
  • But still allows for people to use standard text editors.
  • Also allows for non-text editors built specifically for source code.
    • Quick and efficient editing of the binary format (i.e. quickgo/quickrust concept programs).
    • Graphically represent source code (not the same as a graphical programming language, e.g. Blockly; just an easier way to read code).
    • Frames around things like data structures and function definitions.
    • Could have UML-like representations (not advocating for UML specifically, but it's a possibility).
    • Easy, quick navigation of source code. Things like go-to-definition would be much easier to represent.
  • Makes tooling much easier.
  • Would be easier with a well-defined syntax for the source code (i.e. one that fixes tabs vs spaces and the number of newlines between functions).
  • But it might be better to just store tabs/spaces and newlines in the binary format.
  • Downside: any time the syntax is invalid, everything breaks. But that happens with normal code anyway...
  • Could use a virtual filesystem to automatically convert stored binary to text or vice versa.
  • Any text you edit could basically have any syntax you like, although obviously a standardised version would be best.
  • Could allow for syntax changes.
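
The round-trip requirement above can be illustrated with a minimal sketch. It is not a proposed format: the record layout (a 1-byte kind plus a 4-byte length), the names `to_binary`, `to_text` and `records`, and the `//` comment syntax are all assumptions made for illustration. It stores flat runs of code instead of real syntax tree nodes, and it would misclassify a `//` inside a string literal, but it does show whitespace and comments being kept as first-class records so that the text comes back byte for byte.

```python
import re
import struct

# Hypothetical record kinds for this sketch. A real format would store
# syntax tree nodes rather than flat runs of code.
KINDS = ["whitespace", "comment", "code"]

# Every byte falls into exactly one alternative, so nothing is dropped.
PATTERN = re.compile(
    rb"(?P<whitespace>\s+)"
    rb"|(?P<comment>//[^\n]*)"
    rb"|(?P<code>(?:(?!//)\S)+)"
)

HEADER = struct.Struct("<BI")  # kind (1 byte), payload length (4 bytes)


def to_binary(source: bytes) -> bytes:
    """Encode source text as a sequence of (kind, length, payload) records."""
    out = bytearray()
    for match in PATTERN.finditer(source):
        payload = match.group()
        out += HEADER.pack(KINDS.index(match.lastgroup), len(payload))
        out += payload
    return bytes(out)


def records(blob: bytes):
    """Yield (kind name, payload) pairs from the binary form."""
    offset = 0
    while offset < len(blob):
        kind, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        yield KINDS[kind], blob[offset:offset + length]
        offset += length


def to_text(blob: bytes) -> bytes:
    """Rebuild the original text by concatenating every payload in order."""
    return b"".join(payload for _, payload in records(blob))
```

A tool can then work on the records rather than on raw text, for example listing every comment with `[p for k, p in records(blob) if k == "comment"]`, while a standard text editor still sees the output of `to_text`.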

Schemas not 'data structures'

  • Struct definitions are normally mixed in with the procedural instruction source code.
  • Structures are a binding of **data types** to **variable names**.
  • Separate the **representation** from the **implementation**.
    • Standard native in memory with the same performance and so on.
      • Allow for a separate memory layout. Some architectures (for example, Cell processors) require memory padding.
      • In-memory ordering.
      • Endianness?
    • Serialization.
    • Database backed.
  • Older OOP languages like C++ and Java also bind **methods/member functions** to **data structures**.
  • Newer languages like Rust and Go move away from OOP and primarily use interfaces (i.e. traits).
  • Conceptually design it as **APIs**, not bound functions.
    • Allow for standard native function calls, or RPC calls (IPC or networked), etc.
    • Some kind of distributed backend (Raft, Blockchain).
  • Could allow security definitions in the schema, i.e. who can edit this variable. This lets you separate the security implementation from the data structure.
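
As a rough illustration of separating the representation from the implementation, the sketch below declares a schema once and leaves the physical form to interchangeable backends. Everything named here (`Field`, `Schema`, `STRUCT_CODES`, `pack_native`, `unpack_native`, `to_json`, and the `u8`/`u32`/`f64` type names) is hypothetical and exists only for this example. Byte order is a parameter of the native backend instead of a property of the schema.

```python
import json
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    type: str  # must be a key of STRUCT_CODES below


@dataclass(frozen=True)
class Schema:
    name: str
    fields: tuple


# Hypothetical mapping from schema type names to struct format codes.
STRUCT_CODES = {"u8": "B", "u32": "I", "f64": "d"}


def _format(schema: Schema, byte_order: str) -> str:
    return byte_order + "".join(STRUCT_CODES[f.type] for f in schema.fields)


def pack_native(schema: Schema, values: dict, byte_order: str = "<") -> bytes:
    """Native-style backend: fixed layout, caller chooses the endianness."""
    ordered = (values[f.name] for f in schema.fields)
    return struct.pack(_format(schema, byte_order), *ordered)


def unpack_native(schema: Schema, blob: bytes, byte_order: str = "<") -> dict:
    names = (f.name for f in schema.fields)
    return dict(zip(names, struct.unpack(_format(schema, byte_order), blob)))


def to_json(schema: Schema, values: dict) -> str:
    """Serialization backend: same schema, different representation."""
    return json.dumps({f.name: values[f.name] for f in schema.fields})


POINT = Schema("Point", (Field("id", "u32"), Field("x", "f64"), Field("y", "f64")))
point = {"id": 7, "x": 1.5, "y": -2.0}
```

Calling `pack_native(POINT, point, "<")` and `pack_native(POINT, point, ">")` yields two different 20-byte layouts for the same data, and `to_json(POINT, point)` yields a third form, all without touching the schema. A database-backed or RPC backend would slot in the same way.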

API Versioning

  • Function definitions and the like could be tracked, and breaking changes to syntax noted automatically.
    • Allow adding fields with defaults.
    • Allow optional named arguments.
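
One way such tracking might work is to compare two versions of a function signature and flag only the changes that would break existing callers. The sketch below uses Python's standard `inspect` module. The helper `breaking_changes` and the `connect_v1`/`connect_v2`/`connect_v3` functions are hypothetical names for illustration, and the check is deliberately partial: it ignores parameter reordering and type changes.

```python
import inspect

EMPTY = inspect.Parameter.empty
VARIADIC = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)


def breaking_changes(old, new) -> list:
    """List the signature changes from old to new that break existing callers."""
    old_params = inspect.signature(old).parameters
    new_params = inspect.signature(new).parameters
    problems = []
    for name, old_param in old_params.items():
        new_param = new_params.get(name)
        if new_param is None:
            problems.append(f"removed parameter: {name}")
        elif old_param.default is not EMPTY and new_param.default is EMPTY:
            problems.append(f"default removed: {name}")
    for name, new_param in new_params.items():
        # A new parameter is only safe if callers may omit it.
        if (name not in old_params
                and new_param.default is EMPTY
                and new_param.kind not in VARIADIC):
            problems.append(f"new required parameter: {name}")
    return problems


def connect_v1(host, port): ...


# Adds a field with a default and an optional named argument: compatible.
def connect_v2(host, port, timeout=30, *, retries=3): ...


# Adds a required argument: existing calls stop working.
def connect_v3(host, port, timeout): ...
```

Here `breaking_changes(connect_v1, connect_v2)` returns an empty list, while `breaking_changes(connect_v1, connect_v3)` reports the new required `timeout` parameter.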

Everything a library