Abstract syntax is a way for computer programmers to describe the structure of a program without worrying about the exact text needed to write it. It lets the programmer focus on what the program needs to do before focusing on how to get the computer to perform those functions. The abstract syntax outlines the program’s operations, such as adding two numbers together, and shows what types of data can be used within that program. Once the abstract structure is worked out, it can be represented as an abstract syntax tree, which relates the abstract concepts to concrete syntax: the actual symbols a programmer types to write the program she is creating.
The idea behind abstract syntax is to focus on data types and their relations without getting caught up in the details of how to code them. Computer code is very different from human language, and thinking directly in code can be difficult. Instead, programmers list the steps the program needs to complete and then use concrete syntax to match the abstract terms with the code that performs those steps. Often, the programmer will include data types in her abstract outline to show what kinds of data, such as whole numbers, letters, or decimals, the program can work with. Specific data types are not required at this stage, however, and the programmer may choose to use abstract data types, which are defined only by their behavior and are replaced with specific data types when the program is written.
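As an illustration of the paragraph above, here is a minimal sketch of an abstract syntax for a tiny arithmetic language, written in Python. The names Num, Var, Add, and evaluate are hypothetical, chosen for this example only. The classes say which kinds of expressions exist and what data each one holds, without saying anything about how an expression is typed out as text.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Num:
    """A literal whole number."""
    value: int


@dataclass(frozen=True)
class Var:
    """A named variable whose value is looked up later."""
    name: str


@dataclass(frozen=True)
class Add:
    """The abstract instruction 'add two expressions'."""
    left: "Expr"
    right: "Expr"


# An expression is any one of the three node kinds above.
Expr = Union[Num, Var, Add]


def evaluate(expr: Expr, env: Dict[str, int]) -> int:
    """Compute the value of an expression, given values for its variables."""
    if isinstance(expr, Num):
        return expr.value
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Add):
        return evaluate(expr.left, env) + evaluate(expr.right, env)
    raise TypeError(f"not an expression: {expr!r}")


# "Add the variable x and the number 2", with no concrete syntax involved.
example = Add(Var("x"), Num(2))
result = evaluate(example, {"x": 3})
```

The same Add node could later be written in concrete syntax as x + 2, (+ x 2), or add(x, 2); the abstract description does not change.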
This abstract view of programming is often used in compiler theory. At the hardware level, computers work with only two values: 1 and 0. This is known as binary code. For the computer to run a program written in a programming language, a compiler must translate the words and symbols into machine instructions encoded as a stream of 1s and 0s. Compilers are complex to create, and mapping out an abstract description of what they need to do helps a programmer plan code with fewer errors.
When the programmer wants to map the abstract syntax to concrete syntax and start coding the program or compiler, she creates an abstract syntax tree. This is a branching diagram of the abstract instructions she’s written, such as “add 2 variables,” in which each instruction is a node joined by lines to the parts it operates on, and each node corresponds to the specific code needed to execute that instruction. The programmer can use any abstract terms she wants, but it’s more common to use well-known code terms like “var” for variable and “int” for integer.
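To make the relationship between concrete syntax and an abstract syntax tree more tangible, the sketch below uses Python’s standard ast module, which builds such a tree from source text. It is only one illustration, not the only way to build such a tree. The helper names parse_to_tree and to_concrete are hypothetical, and ast.unparse assumes Python 3.9 or later.

```python
import ast


def parse_to_tree(source: str) -> ast.Module:
    """Turn concrete syntax (source text) into an abstract syntax tree."""
    return ast.parse(source)


def to_concrete(tree: ast.AST) -> str:
    """Turn an abstract syntax tree back into source text (Python 3.9+)."""
    return ast.unparse(tree)


# Concrete syntax for the abstract instruction "add 2 variables".
tree = parse_to_tree("total = a + b")

# The first statement is an assignment whose value is a binary operation.
statement = tree.body[0]
print(type(statement).__name__)           # Assign
print(type(statement.value).__name__)     # BinOp
print(type(statement.value.op).__name__)  # Add

# Different spacing is different concrete syntax but the same abstract tree.
same_tree = ast.dump(parse_to_tree("total=a+b")) == ast.dump(tree)
print(same_tree)                          # True

# Map the tree back to concrete syntax.
print(to_concrete(tree))
```

The tree records that an addition of two names is assigned to a third name; details such as spacing belong to the concrete syntax and do not appear in the tree.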