Difference between revisions of "User:Hatter/programming principles"
(→Low-Level (Machine)) |
(→Unless) |
||
(63 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
− | + | [[Programming language]]s and miscellaneous command texts of the same nature are broken into 3 categories: | |
+ | * Low-to-mid-level: [[Assembly]] and [[shellcode|machine language]]s | ||
+ | * Mid-level: [[Compiled language]]s | ||
+ | :''[[C]] • [[C++]]'' | ||
+ | * High-level: [[Interpreted languages]] | ||
+ | :''[[perl]] • [[ruby]] • [[php]] • [[python]] • [[lua]]'' | ||
− | + | These three "levels" of languages differ immensely in the way they are translated into executable code in an [[operating system]]'s [[ram|memory]]. Assembly and machine languages can be classified as either mid level or low level due to the fact that both ''embedded code'' or '''real-mode''' code, also called ''[[firmware]]'', and ''[[application]] code'' or '''protected-mode''' code, also called ''[[application|software]]'' can be written in machine language. | |
− | = | + | == Pseudocode == |
− | + | Pseudocode is a human readable expression of the intended functionality of a program. It tends to consist of very high level descriptions of the code in question while leaving out aspects of [[programming language]]s that, while necessary for proper execution, are an unnecessary hindrance to a human's ability to understand the code. Pseudocode is often used in place of actual code during the initial design of a program to allow the writer to outline the function of their program and how they intend to implement it in an easy to understand manner. Pseudocode cannot be compiled nor executed; it is solely for human interpretation. | |
− | + | == Style and organization == | |
− | + | A critical strategy to employ for code organization is the following of previously established standards, which will, more often than not, allow for easier interpretation and [[debugging]] of written code, if not for oneself then for others. This is known as [[programming]] style, the proper use of which can be the difference between a bad hacker, a good hacker, and an amazing hacker. | |
− | + | The vast majority of [[programming language]]s offer the ability to comment one's code through use of a special syntax. It is important that one uses comments to document one's code in any place where the code can be perceived as not being perfectly clear, with "perfectly" being the operative word. It is always useful to think "If I returned to this code in 5 years, would I be able to quickly understand exactly what it was doing?". | |
− | =Variables and Data Types= | + | Variables are something that one will deal with constantly in one's [[programming]] activities. Variables are "names" given to some piece of data and, like the term implies, can vary during the course of one's program. Satisfactory programming style incorporates informative variable names. For example, if one has created a program that took input from a user and changed the first letter of their name, it would be bad style to name that variable "apple", as opposed to "userInput". The ideal is written code that is self-documenting, the organization and variable naming scheme being so good that the final product requires very few comments to explain its execution. |
+ | |||
+ | The primary benefit of good programming style is that clear organization of one's program will often prove invaluable in [[debugging]]. One will generally be able to spot errors at a much faster rate, compared to someone who was careless in crafting their program (resulting in the dreaded "spaghetti code"). Proper use of programming style also makes it easier to update one's code if patching any [[exploitation|exploits]]/[[vulnerability|vulnerabilities]] becomes necessary. | ||
+ | |||
+ | There are sometimes minute customary programming practices which are specific to a particular language (in [https://www.java.com/en/download/faq/whatis_java.xml Java] it is customary to use camelCase to name variables, whereas in [[Python]] one may see variables written as name_of_var), but, like any other stylistic practices, they may be either ignored, violated, or entirely absent in any given program. | ||
+ | |||
+ | It's best to use existing practices within whatever [[application]] is being written, the next best alternative being to develop a standard before allowing others to contribute to one's own code. | ||
+ | |||
+ | == Variables and Data Types == | ||
:''See also: Variables in [[LUA#Variables|LUA]], [[Ruby#Variables|Ruby]], [[PHP#Variables|PHP]], [[Perl variables|Perl]], [[Defining variables in python|Python]], [[C Variables|C]], and [[C++#Variables_and_Data_Types|C++]]'' | :''See also: Variables in [[LUA#Variables|LUA]], [[Ruby#Variables|Ruby]], [[PHP#Variables|PHP]], [[Perl variables|Perl]], [[Defining variables in python|Python]], [[C Variables|C]], and [[C++#Variables_and_Data_Types|C++]]'' | ||
− | Variables, as the name implies, are dynamic | + | Variables, as the name implies, are dynamic user-defined items used in programming, which change based on the context of the code being written, such as the programming language used or the actions of the [[application]]. |
+ | |||
+ | === Low-Level (Machine) === | ||
+ | :''See also: [[binary data]], [[assembly#register|registers]], [[Assembly#Memory_Addressing|memory addressing]]'' | ||
− | + | Any given piece of data that the processor can perform operations on is a [[binary]] value, usually written in [[hexadecimal]], that does not exceed the size of a [[register]] for the given [[instruction set architecture]]. | |
− | + | ||
− | + | * If a register is ''b'' bits, the highest value that can be placed into it is 2 raised to the power ''b'', minus 1, or: | |
2^b - 1 | 2^b - 1 | ||
− | Conversely, the highest bit set in any given value can be calculated with : | + | * Conversely, the highest [[bit]] set in any given value can be calculated with : |
log(value)/log(2) - (log(value)/log(2) % 1) | log(value)/log(2) - (log(value)/log(2) % 1) | ||
Line 40: | Line 55: | ||
qword 64 8 | qword 64 8 | ||
− | [[Memory addresses]] may contain offsets, [[virtualization|virtual addresses]], or absolute addresses. These addresses can usually be edited or read in increments of ''b'' | + | [[Memory addresses]] may contain offsets, [[virtualization|virtual addresses]], or absolute addresses. These addresses can usually be edited or read in increments of ''b''. On some x86 [[instruction set architecture]]s, one or more registers can be appended to obtain a larger value. |
− | == Mid-Level == | + | === Mid-Level === |
In mid-level languages, variable types are broken up further into '''integers''', '''strings''', '''character arrays''', '''floating point''', '''booleans''', and more. In these languages, ''arrays'' are simply constructs of a particular type (non-mixed) and ''Associative Arrays'' or ''hashes'' are actually user-defined '''data types'''. | In mid-level languages, variable types are broken up further into '''integers''', '''strings''', '''character arrays''', '''floating point''', '''booleans''', and more. In these languages, ''arrays'' are simply constructs of a particular type (non-mixed) and ''Associative Arrays'' or ''hashes'' are actually user-defined '''data types'''. | ||
'''Integer''': | '''Integer''': | ||
− | * A whole number, positive or negative | + | * A whole number, positive or negative. An integer is commonly four bytes, though the exact size depends on the language and platform. | |
'''Floating point''': | '''Floating point''': | ||
− | * A number with a decimal point | + | * A number with a decimal point. Single-precision floating point numbers are four bytes; double-precision numbers are eight. | |
'''Character''': | '''Character''': | ||
− | * A single text character, e.g. "h" | + | * A single byte text character, e.g. "h" |
'''String''': | '''String''': | ||
− | * Multiple text characters, e.g. "hello world!" | + | * Multiple text characters, e.g. "hello world!". In C and similar languages, strings are null terminated, meaning that they end in \x00. | |
− | + | ||
− | + | ||
'''Boolean''': | '''Boolean''': | ||
− | * Can only be true or false. Some languages preprocess these as 1 or 0, respectively. | + | * Can only be true or false. Some languages represent these as binary 1 or 0, respectively; one bit would suffice, although a boolean usually occupies at least one byte in memory. | |
+ | |||
+ | '''References''': | ||
+ | * Also known as pointers, references store the address of data in memory. | ||
− | == High-Level == | + | === High-Level === |
In high-level languages, there are generally three types of variables: '''Scalars''' (single values), '''Arrays''' (or lists), and '''Associative Arrays''' (or hashes). | In high-level languages, there are generally three types of variables: '''Scalars''' (single values), '''Arrays''' (or lists), and '''Associative Arrays''' (or hashes). | ||
'''Scalars''': | '''Scalars''': | ||
− | * A ''scalar'' is the simplest form of a variable, as it is a single object. | + | * A ''scalar'' is the simplest form of a variable, as it is a single object. For example, one may recall a person's name or a single number. These are typically used for basic things like running the same equation (think X + 10 =) when you'll need changing data to test. |
'''Arrays''': | '''Arrays''': | ||
− | * An ''array'' is a bit more complex than a Scalar. Arrays are formed from multiple scalars and can contain scalars or predefined, static strings. | + | * An ''array'' is a bit more complex than a Scalar. Arrays are formed from multiple scalars and can contain scalars or predefined, static strings. One may have a variable for a classroom, which inside lists <i>all</i> the names rather than one. |
'''Associative Arrays''': | '''Associative Arrays''': | ||
− | * An ''associative array'' is similar to an Array, but with two fields that compare each other instead of individual strings. For instance | + | * An ''associative array'' is similar to an Array, but each entry pairs two fields, a key and a value, instead of holding an individual string. For instance, one might have an array that maps the last names of a class to their first names, such that at any point if one knows a person's last name, one can find their first name. | |
'''"Mixed" Variables''': | '''"Mixed" Variables''': | ||
− | * The true benefit of variables comes when | + | * The true benefit of variables comes when one combines them. One may have a ''scalar'' called ''number'' to recall an element in ''array'' called ''months'' to pull a month's name by its chronological number, or use a ''scalar'' called ''last_name'' that refers to your ''hash'' called ''full_names'' so that every time one recalls last_name it associates the first name. |
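The three high-level variable kinds, and the way they combine, can be sketched in Python (chosen here only for illustration). The names in `full_names` are hypothetical sample data, not taken from the article.

```python
# Scalar: a single value.
number = 3

# Array (a Python list): an ordered collection of scalars.
months = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Associative array (a Python dict): last name -> first name.
# These names are made-up sample data.
full_names = {"Liddell": "Alice", "Dodgson": "Charles"}

# "Mixed" use: a scalar selects an element of the array.
# Lists count from zero, so month number 3 lives at index 2.
month_name = months[number - 1]

# A scalar used as the key of the hash.
last_name = "Liddell"
first_name = full_names[last_name]

print(month_name, first_name)
```

Subtracting one from `number` is needed because the list is indexed from zero while months are conventionally numbered from one.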
− | == Arithmetic == | + | == Data operations == |
− | :''See also: | + | === Arithmetic === |
+ | :''See also: [[Bitwise math]] and arithmetic in [[CPP#Arithmetic|C++]] and [[Perl#Mathematical|Perl]]'' | ||
* TODO: Some notes about bitwise operations | * TODO: Some notes about bitwise operations | ||
* TODO: Explanation of "Modulus" and "Integer Division" | * TODO: Explanation of "Modulus" and "Integer Division" | ||
− | = Control Flow = | + | === Casting and Type Conversions === |
+ | :''See also:'' | ||
+ | |||
+ | Sometimes variables need to be accessed as a different type than they are; for example, an integer as a string. This can be done by typecasting. | ||
+ | |||
+ | <code> | ||
+ | string str1 = (string)3 | ||
+ | </code> | ||
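As a rough illustration of the same idea in a real language, Python converts between types by calling the target type as a function rather than with a C-style cast. The variable names below are arbitrary.

```python
number = 3
text = str(number)      # integer -> string, gives "3"

parsed = int("42")      # string -> integer
ratio = float("2.5")    # string -> floating point
truncated = int(7.9)    # float -> integer drops the fractional part

print(text + "!", parsed + 1, ratio * 2, truncated)
```

Conversions that make no sense, such as `int("hello")`, raise an error instead of producing a value.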
+ | |||
+ | == Control Flow == | ||
:''See also: Control Flow in [[CPP#Relational|C++]], [[Python#Python_Operators|Python]], [[PHP#Operators|PHP]], and [[Perl#Operators|Perl]]'' | :''See also: Control Flow in [[CPP#Relational|C++]], [[Python#Python_Operators|Python]], [[PHP#Operators|PHP]], and [[Perl#Operators|Perl]]'' | ||
− | ==Validity== | + | Control flow is anything that affects the path a program follows as it runs; examples are conditionals and loops. | |
+ | |||
+ | === Conditional Validity === | ||
− | ''Validity'' is a simple concept. If a condition is true, the [[#Conditionals|conditional]] or [[#Loops|loop]] will act accordingly. Likewise | + | ''Validity'' is a simple concept. If a condition is true, the [[#Conditionals|conditional]] or [[#Loops|loop]] will act accordingly. Likewise, if a condition is false (a 0, empty string, or incorrect comparative operator), it will treat the loop or conditional in the opposite manner. |
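A minimal sketch of validity in Python, where `bool()` reports how a value would be treated by a conditional or loop. The helper name `is_valid` is hypothetical, and the exact set of values treated as false differs between languages.

```python
def is_valid(condition) -> bool:
    """Return how Python would treat the value in an if or while test."""
    return bool(condition)


# Non-zero numbers and non-empty strings are treated as true.
print(is_valid(7), is_valid("text"))

# 0, the empty string, and a comparison that does not hold are false.
print(is_valid(0), is_valid(""), is_valid(7 < 5))
```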
− | ==Comparative Operators== | + | === Comparative Operators === |
− | Comparative operators are used when | + | Comparative operators are used when one wants to obtain a true or false value by comparing two other values. For example, one might have a 7 and a 5. Since neither is 0 or an empty string, (7) on its own would always be true, and (5) would likewise always be true, but would 7 <i>less than</i> 5 be true? Common comparative operators are ''equal to, not equal to, greater than, greater than or equal to, less than,'' and ''less than or equal to''; the logical operators ''and'' and ''or'' combine their results. The operators return true when the mathematical condition is satisfied. | |
<center> | <center> | ||
Line 138: | Line 166: | ||
− | == | + | === Conditional statements === |
− | :''See also: | + | :''See also: Conditional statements in [[C#If|C]], [[CPP#If_.26_Else|C++]], [[PHP#Boolean_Logic|PHP]], [[Perl#Statements|Perl]], [[Python#Statements_and_Loops|Python]], and [[Ruby#Statements|Ruby]]'' |
− | Typically found as ''if, unless, else'', or ''else if'', conditionals take action according to the <i>validity</i> of a condition. That | + | Typically found as ''if, unless, else'', or ''else if'', conditionals take action according to the <i>validity</i> of a condition. That is to say, they either do an action contained in the block of code, or skip past the block altogether depending on the conditional used. |
− | ===If=== | + | ====If==== |
The most common conditional is ''if''. ''If'' your condition returns true, for instance: | The most common conditional is ''if''. ''If'' your condition returns true, for instance: | ||
Line 153: | Line 181: | ||
It proceeds with the block and performs actions. If it returns false, it ignores the block altogether and goes to the next. | It proceeds with the block and performs actions. If it returns false, it ignores the block altogether and goes to the next. | ||
− | ===Unless=== | + | ====Unless==== |
− | ''Unless'' is the exact opposite of if. It only continues with the block if the condition isn't met and returns a false value. | + | ''Unless'' is the exact opposite of ''if''. It only continues with the block if the condition isn't met, that is, if it returns a false value. It is functionally equivalent to "if not", but is provided as a native in many languages. | |
<code> | <code> | ||
− | unless ($variable < $limit | + | unless ($variable < $limit) |
do something | do something | ||
</code> | </code> | ||
− | This conditional can be reworded into an if, and | + | This conditional can be reworded into an ''if'', and one should always use what feels more natural to them ("if exists $variable" is the same as "unless not exists $variable"). |
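Python has no ''unless'' keyword, so the same test is written as "if not". The sketch below assumes a hypothetical function name and return strings purely for illustration.

```python
def check_limit(variable: int, limit: int) -> str:
    # "unless (variable < limit)" is spelled "if not (variable < limit)".
    if not (variable < limit):
        return "limit reached"
    return "below limit"


print(check_limit(12, 10))
print(check_limit(3, 10))
```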
− | ===Else=== | + | ====Else==== |
− | ''Else'' is plain conditional in that it doesn't test anything except the previous conditional. For instance | + | ''Else'' is a plain conditional in that it doesn't test anything except the previous conditional. For instance, one could have a statement that reads: | |
<code> | <code> | ||
Line 172: | Line 200: | ||
do this | do this | ||
else | else | ||
− | print " | + | print "Help, the ceiling's collapsing!" |
</code> | </code> | ||
− | If the previous condition is not satisfied, it will always run. If it | + | If the previous condition is not satisfied, it will always run. If it is, it will never run. |
− | ===Else if=== | + | ====Else if==== |
− | Else if, or more commonly ''elseif'' is exactly what it sounds like: A combination of if and else. It tests the previous condition and then tests a new condition if the previous one was false. A good example of <u>Goldilocks and the Three Bears</u> is: | + | ''Else if'', or more commonly ''elseif'', is exactly what it sounds like: a combination of ''if'' and ''else''. It tests a new condition only if the previous one was false. A good example out of <u>Goldilocks and the Three Bears</u> is: | |
<code> | <code> | ||
if $temperature is greater than 10 | if $temperature is greater than 10 | ||
− | print "This porridge is too hot" | + | print "This porridge is too hot." |
elseif $temperature is less than 10 | elseif $temperature is less than 10 | ||
− | print "This porridge is too cold" | + | print "This porridge is too cold." |
else | else | ||
print "This porridge is just right!" | print "This porridge is just right!" | ||
Line 192: | Line 220: | ||
Only if the first condition isn't already met, it moves on to test the next one. If the next one isn't met, either, the else at the end assumes itself true. | Only if the first condition isn't already met, it moves on to test the next one. If the next one isn't met, either, the else at the end assumes itself true. | ||
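The porridge example translates almost directly into Python, which spells ''else if'' as `elif`. The function name `judge_porridge` is hypothetical; the threshold of 10 is the one used in the pseudocode.

```python
def judge_porridge(temperature: int) -> str:
    if temperature > 10:
        return "This porridge is too hot."
    elif temperature < 10:
        # Only tested when the first condition was false.
        return "This porridge is too cold."
    else:
        # Reached only when neither test above was true, i.e. exactly 10.
        return "This porridge is just right!"


for reading in (15, 4, 10):
    print(judge_porridge(reading))
```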
− | ===Switch=== | + | ====Switch==== |
− | + | Switch statements are similar to regular conditionals, except that they deal with different values of a single variable. Each conditional in a switch statement is referred to as a "case" statement and the default result, if none of the cases match, is called "default". Each case should be terminated with a break. For example: | |
− | + | <code> | |
− | + | switch (a){ | |
+ | case 1: | ||
+ | do_this() | ||
+ | break | ||
+ | case 2: | ||
+ | do_this_2() | ||
+ | break | ||
+ | default: | ||
+ | default() | ||
+ | } | ||
+ | </code> | ||
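Not every language has a switch statement. One common substitute, sketched here in Python under the assumption that each case simply calls a function, is a dictionary that maps case values to handlers. The handler names and return strings are hypothetical.

```python
def do_this() -> str:
    return "handled case 1"


def do_this_2() -> str:
    return "handled case 2"


def do_default() -> str:
    return "handled default"


def switch(a: int) -> str:
    # Each key plays the role of a "case"; no break is needed because
    # exactly one handler is looked up and called.
    cases = {1: do_this, 2: do_this_2}
    handler = cases.get(a, do_default)  # fall back to the default handler
    return handler()


print(switch(1), switch(2), switch(99))
```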
− | === | + | == Loops == |
+ | === Basic Loops === | ||
− | ''While'' is very similar to if, generally both in syntax and function. If a condition returns true, it runs the block, and then checks again to see if the block is still true. If it is, it runs it again. This means if | + | :''See also: Loops in [[C#Loops|C]], [[CPP#Loop_Functions|C++]], [[Perl#Loops|Perl]], [[PHP#Loops|PHP]], [[Python#While_Loop|Python]] and [[Ruby#Loops|Ruby]]'' |
+ | Loops are very similar to conditionals in that they test a condition in order to function; unlike conditionals, instead of continuing with the program, they go back to the start of the block they are in until that condition is no longer true. A loop whose condition is always true, either intentionally or not, is called an <i>infinite loop</i> and will run until the program is stopped or the loop is escaped. To avoid infinite loops, one must change the condition in some way. Generally, loops test against variables that are changed inside the loop. $variable++ (the common form for increment, or +1) takes a variable and adds one to its value; for example, if a loop tests that its variable is less than a number (say 10), adding $variable++ to the block makes it repeat until the variable reaches that number (10 times, starting from $variable equaling 0). Common examples of loops are ''while'' and ''until''. | |
+ | |||
+ | ====While==== | ||
+ | |||
+ | ''While'' is very similar to ''if'', generally both in syntax and function. If a condition returns true, it runs the block, and then checks again to see if the condition is still true. If it is, it runs the block again. This means that if one doesn't do something to change the condition somewhere in the block, it will always be true and run infinitely. 1, since it is not an empty string, false comparative operator, or 0, would always test as true, so ''while (1)'' will perform the same action or set of actions repeatedly. | |
<code> | <code> | ||
Line 209: | Line 253: | ||
</code> | </code> | ||
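A small Python sketch of a ''while'' loop that avoids running forever by changing its own condition. Python has no `++` operator, so the increment is written `variable += 1`; the function name `count_to` is hypothetical.

```python
def count_to(limit: int) -> int:
    """Run a while loop until variable reaches limit; return the pass count."""
    variable = 0
    iterations = 0
    while variable < limit:
        iterations += 1
        variable += 1  # without this line the condition would never change
    return iterations


print(count_to(10))
```

Starting from 0 and testing "less than 10", the block runs ten times before the condition becomes false.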
− | ===Until=== | + | ====Until==== |
− | ''Until'' is | + | ''Until'' is While's polar opposite. It tests for a false value, and runs the loop until the value tests positive. It is functionally equivalent to "while not", though is provided in many languages as a native. Taking the Goldilocks example again, one could say |
<code> | <code> | ||
Line 218: | Line 262: | ||
</code> | </code> | ||
− | + | In this example, the program keeps testing random numbers until a 10 is reached, at which point one could program what would happen with the best porridge. | |
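Python has no ''until'' keyword, so the loop is written as "while not". This sketch assumes temperatures are drawn at random from 1 to 20; the range and the function name are illustrative choices, not part of the original example.

```python
import random


def stir_until_just_right(rng: random.Random) -> int:
    """Draw random temperatures until a 10 appears; return the attempt count."""
    attempts = 0
    temperature = None
    # "until temperature is 10" becomes "while not temperature == 10".
    while not temperature == 10:
        temperature = rng.randint(1, 20)  # inclusive on both ends
        attempts += 1
    return attempts


print(stir_until_just_right(random.Random()))
```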
− | == Iteration == | + | === Iteration === |
− | === For === | + | ==== For ==== |
− | === Foreach === | + | ==== Foreach ==== |
+ | Iteration is similar to looping, in that it performs a task repeatedly, but iteration is usually done for each item in a set (or list). These are usually called ''for'' loops or ''foreach'' loops. In most languages, a ''for'' loop loops over each item in the list and allows the user to refer to it by a given name. The following example loops over the list ''variables'' and prints each item in it as ''variable''. | |
− | + | <code> | |
+ | for variable in variables | ||
+ | print variable | ||
+ | </code> | ||
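The pseudocode above is close to valid Python already. A runnable version, with hypothetical list contents, might look like this:

```python
variables = ["alpha", "beta", "gamma"]  # sample data for illustration

seen = []
for variable in variables:
    # "variable" names the current item on each pass through the list.
    print(variable)
    seen.append(variable.upper())

print(seen)
```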
− | ==Functions== | + | ==Functional Programming== |
+ | |||
+ | ===Functions=== | ||
:''See also: Functions in [[CPP#Functions|C++]], [[PHP#Functions|PHP]], [[Perl#User-Defined_Functions|Perl]], [[Python#Functions|Python]], and [[LUA#Functions|LUA]]'' | :''See also: Functions in [[CPP#Functions|C++]], [[PHP#Functions|PHP]], [[Perl#User-Defined_Functions|Perl]], [[Python#Functions|Python]], and [[LUA#Functions|LUA]]'' | ||
− | A function contains code that will be executed whenever the function is "called." | + | A function contains code that will be executed whenever the function is "called." Functions can be used to prevent redundancy in code and keep it organized. Functions accept arguments and can return a value: this allows a function to be dynamic. For example, one might have a function that accepts two numbers and adds them together and returns the result. This might look like (in pseudo code): |
<code> | <code> | ||
Line 240: | Line 290: | ||
This code would add the two arguments together (in this case a (= 1) and b (= 2)) then it adds them together and returns the result (3), the return value is then printed. | This code would add the two arguments together (in this case a (= 1) and b (= 2)) then it adds them together and returns the result (3), the return value is then printed. | ||
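For comparison, a minimal Python version of the same idea, with `a` and `b` set to the values used in the explanation:

```python
def add(a: int, b: int) -> int:
    """Accept two arguments and return their sum."""
    return a + b


result = add(1, 2)  # the return value is stored...
print(result)       # ...and then printed
```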
− | ==Recursion== | + | ===Recursion=== |
+ | A recursive function is one that calls itself, repeating work much as a loop does, until a terminating condition (the base case) is met and it returns. In the following example, our recursive function solves for a factorial of ''n'': each time the function is called it folds ''n'' into the running total and reduces ''n'', and it returns once ''n'' is zero. | |
+ | |||
<code> | <code> | ||
func factorial(integer n, integer total = 0) | func factorial(integer n, integer total = 0) | ||
Line 252: | Line 304: | ||
</code> | </code> | ||
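A hedged Python sketch of a recursive factorial follows. Because a factorial is a product, this version multiplies ''n'' into the running total and starts the total at 1; that is an assumption about the intent and may differ from the pseudocode, whose signature starts the total at 0.

```python
def factorial(n: int, total: int = 1) -> int:
    """Compute n! recursively, carrying the running product in total."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    if n == 0:
        # Base case: nothing left to multiply, so return the result.
        return total
    # Recursive case: fold n into the total and reduce n by one.
    return factorial(n - 1, total * n)


print(factorial(5))
```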
− | = | + | == Classes and objects == |
:''See also: Objects in [[CPP#Classes|C++]], [[PHP#Classes|PHP]], and [[Python#Classes|Python]]'' | :''See also: Objects in [[CPP#Classes|C++]], [[PHP#Classes|PHP]], and [[Python#Classes|Python]]'' | ||
+ | |||
+ | Object oriented programming allows a programmer to create ''objects'' to simplify programming. For example, a programmer might create a "cat" class and a "dog" class. Classes contain properties such as functions (known as methods when used in classes) and variables. | |
+ | |||
+ | <code> | ||
+ | class cat { | ||
+ | integer size | ||
+ | integer hair_color | ||
+ | function purr | ||
+ | // code to purr | ||
+ | } | ||
+ | class dog { | ||
+ | integer size | ||
+ | integer hair_color | ||
+ | function growl | ||
+ | // code to growl | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | === States === | ||
+ | |||
+ | ==== Static utility classes ==== | ||
+ | |||
+ | ==== Instances ==== | ||
+ | Our cats and dogs each have a size and hair color variable and a purr / growl function, respectively. An instance can then be instantiated and used in this fashion: | ||
+ | |||
+ | <code> | ||
+ | cat kitty | ||
+ | kitty.size = 3 | ||
+ | kitty.purr() | |
+ | </code> | ||
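The class and its instance can be sketched in Python as follows. The default values of 0 and the string returned by `purr` are assumptions made so the example runs.

```python
class Cat:
    """Rough Python counterpart of the pseudocode cat class."""

    def __init__(self) -> None:
        # Every instance gets its own copy of these variables.
        self.size = 0
        self.hair_color = 0

    def purr(self) -> str:
        return "purr"


kitty = Cat()          # instantiate: kitty is one instance of Cat
kitty.size = 3         # changes this instance only
sound = kitty.purr()   # call a method on the instance
print(kitty.size, sound)
```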
+ | |||
+ | |||
+ | ==== Extension ==== | ||
+ | Objects also allow for inheritance, so a class can "inherit" all of the functions and variables from another class, so our cat and dog class could become an animal, cat, and dog class. This allows us to prevent redundant code. | ||
+ | |||
+ | <code> | ||
+ | class animal { | ||
+ | public: | |
+ | integer size | ||
+ | integer hair_color | ||
+ | } | ||
+ | class cat inherits animal { | ||
+ | function purr() | ||
+ | // code to purr | ||
+ | } | ||
+ | class dog inherits animal { | ||
+ | function growl() | ||
+ | // code to growl | ||
+ | } | ||
+ | cat kitty | ||
+ | kitty.size = 3 | ||
+ | dog puppy | ||
+ | puppy.size = 3 | ||
+ | </code> | ||
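Inheritance looks much the same in Python, where the parent class is named in parentheses. The default values and returned strings below are assumptions for illustration.

```python
class Animal:
    """Shared variables live in the parent class."""

    def __init__(self, size: int = 0, hair_color: int = 0) -> None:
        self.size = size
        self.hair_color = hair_color


class Cat(Animal):
    def purr(self) -> str:
        return "purr"


class Dog(Animal):
    def growl(self) -> str:
        return "growl"


kitty = Cat()
kitty.size = 3    # size is inherited from Animal
puppy = Dog()
puppy.size = 3
print(kitty.purr(), puppy.growl())
```

Neither `Cat` nor `Dog` declares `size` or `hair_color` itself; both receive them from `Animal`, which is what removes the redundancy.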
+ | |||
+ | ==== Abstraction ==== | ||
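+ | A class is abstract when it cannot be instantiated directly and exists only to be extended; child classes inherit its functionality and properties and supply whatever it leaves undefined. A class that can no longer be extended is usually called ''final'' or ''sealed'' instead. | |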
+ | |||
+ | |||
+ | === Scope === | ||
+ | Objects also allow for permissions. Standard convention suggests that variables should be ''private'' (meaning only the class's functions can access the variables) and functions should be ''public'' (anything can access these elements). A programmer can also create a constructor and destructor for the class (functions called when the class is instantiated or destroyed, respectively). Example: | ||
+ | ====public==== | ||
+ | ====private==== | ||
+ | ====protected==== | ||
+ | <code> | ||
+ | class cat { | ||
+ | private: | ||
+ | integer size | ||
+ | integer hair_color | ||
+ | public: | ||
+ | cat() { // constructor | ||
+ | size = 3 | ||
+ | } | ||
+ | ~cat() { // destructor | |
+ | delete size | ||
+ | } | ||
+ | function purr() | ||
+ | // code to purr | ||
+ | } | ||
+ | </code> | ||
+ | |||
+ | === Creation, destruction, and validity === | ||
+ | ==== Accessors ==== | ||
+ | |||
+ | Accessors are functions used to access private variables of a class. This allows a programmer to change aspects of the class without affecting the usage of the class. | |
+ | |||
+ | <code> | ||
+ | class cat { | ||
+ | private: | ||
+ | integer size | ||
+ | public: | ||
+ | function set_size(new_size) { | ||
+ | size = new_size | ||
+ | } | ||
+ | function get_size() { | ||
+ | return size | ||
+ | } | ||
+ | } | ||
+ | </code> | ||
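A Python sketch of the same accessors follows. Python does not enforce privacy; a leading underscore marks a variable as private by convention only. The negative-size check is an added assumption, included to show how a setter lets the class change its rules without changing how callers use it.

```python
class Cat:
    def __init__(self, size: int = 3) -> None:
        self._size = size  # leading underscore: private by convention

    def get_size(self) -> int:
        return self._size

    def set_size(self, new_size: int) -> None:
        # Hypothetical rule added for illustration.
        if new_size < 0:
            raise ValueError("size cannot be negative")
        self._size = new_size


kitty = Cat()
kitty.set_size(5)
print(kitty.get_size())
```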
+ | |||
+ | ==== Constructors ==== | ||
+ | ==== Destructors ==== | ||
+ | |||
+ | == Referencing == | |
+ | :''See also: Referencing in [[Perl#References|Perl]]'' | ||
+ | |||
+ | Referencing allows data to be passed by reference, instead of by value. This can make programs much faster, as data does not have to be copied for every function call. In many high-level languages, this happens implicitly, as it is more efficient. However, lower-level languages, such as C, generally require a reference to be passed explicitly. A pointer in C is declared with an asterisk (*) and can be "dereferenced" (accessing the value at the address) with another asterisk, while an ampersand (&) takes the address of a variable. For example: | |
+ | |||
+ | <code> | ||
+ | int value = 3; | ||
+ | int *ptr = &value; | ||
+ | print(*ptr); | ||
+ | </code> | ||
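In a language where references are implicit, the effect can still be observed. The Python sketch below (function names are hypothetical) passes a list to a function without copying it, so a change made inside the function is visible to the caller, while rebinding a local name is not.

```python
def append_in_place(items: list) -> None:
    # items refers to the caller's list; no copy is made.
    items.append(4)


def rebind(number: int) -> None:
    # Assigning to the parameter only changes the local name.
    number = number + 1


values = [1, 2, 3]
alias = values            # a second name for the same list
append_in_place(values)

count = 10
rebind(count)

print(values, alias is values, count)
```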
+ | |||
+ | === Anonymous References === | ||
+ | |||
+ | Many languages have the ability to call a function by its name rather than its address, for example ''"function_name"()'' vs ''function_name()''. This allows a programmer to dynamically call functions. These are also called [[anonymous function calls]]. | ||
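One way to call a function by its name in Python is to look the name up in a dictionary of functions. This is only a sketch of the idea; the registry `FUNCTIONS` and the two sample functions are hypothetical.

```python
def greet() -> str:
    return "hello"


def farewell() -> str:
    return "goodbye"


# Maps the name, as a string, to the function object itself.
FUNCTIONS = {"greet": greet, "farewell": farewell}


def call_by_name(name: str) -> str:
    # The string is chosen at run time, so the call is dynamic.
    return FUNCTIONS[name]()


print(call_by_name("greet"), call_by_name("farewell"))
```

An explicit registry limits which functions can be reached by name, which is generally safer than looking names up in the whole program.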
+ | |||
+ | === Functionally anonymous collection handling === |
Latest revision as of 21:06, 16 May 2013
Programming languages and miscellaneous command texts of the same nature are broken into 3 categories:
- Low-to-mid-level: Assembly and machine languages
- Mid-level: Compiled languages
- High-level: Interpreted languages
These three "levels" of languages differ immensely in the way they are translated into executable code in an operating system's memory. Assembly and machine languages can be classified as either mid level or low level due to the fact that both embedded code or real-mode code, also called firmware, and application code or protected-mode code, also called software can be written in machine language.
Pseudocode
Pseudocode is a human readable expression of the intended functionality of a program. It tends to consist of very high level descriptions of the code in question while leaving out aspects of programming languages that, while necessary for proper execution, are an unnecessary hindrance to a human's ability to understand the code. Pseudocode is often used in place of actual code during the initial design of a program to allow the writer to outline the function of their program and how they intend to implement it in an easy to understand manner. Pseudocode cannot be compiled nor executed; it is solely for human interpretation.
Style and organization
A critical strategy to employ for code organization is the following of previously established standards, which will, more often than not, allow for easier interpretation and debugging of written code, if not for oneself then for others. This is known as programming style, the proper use of which can be the difference between a bad hacker, a good hacker, and an amazing hacker.
The vast majority of programming languages offer the ability to comment one's code through use of a special syntax. It is important that one uses comments to document one's code in any place where the code can be perceived as not being perfectly clear, with "perfectly" being the operative word. It is always useful to think "If I returned to this code in 5 years, would I be able to quickly understand exactly what it was doing?".
Variables are something that one will deal with constantly in one's programming activities. Variables are "names" given to some piece of data and, like the term implies, can vary during the course of one's program. Satisfactory programming style incorporates informative variable names. For example, if one has created a program that took input from a user and changed the first letter of their name, it would be bad style to name that variable "apple", as opposed to "userInput". The ideal is written code that is self-documenting, the organization and variable naming scheme being so good that the final product requires very few comments to explain its execution.
The primary benefit of good programming style is that clear organization of one's program will often prove invaluable in debugging. One will generally be able to spot errors at a much faster rate, compared to someone who was careless in crafting their program (resulting in the dreaded "spaghetti code"). Proper use of programming style also makes it easier to update one's code if patching any exploits/vulnerabilities becomes necessary.
There are sometimes minute customary programming practices which are specific to a particular language (in Java it is customary to use camelCase to name variables, whereas in Python one may see variables written as name_of_var), but, like any other stylistic practices, they may be either ignored, violated, or entirely absent in any given program.
It's best to use existing practices within whatever application is being written, the next best alternative being to develop a standard before allowing others to contribute to one's own code.
Variables and Data Types
Variables, as the name implies, are dynamic user-defined items used in programming, which change based on the context of the code being written, such as the programming language used or the actions of the application.
Low-Level (Machine)
- See also: binary data, registers, memory addressing
Any given piece of data that the processor can perform operations on is a binary value, usually written in hexadecimal, that does not exceed the size of a register for the given instruction set architecture.
- If a register is b bits, the highest value that can be placed into it is 2 raised to the power b, minus 1, or:
2^b - 1
- Conversely, the highest bit set in any given value can be calculated with :
log(value)/log(2) - (log(value)/log(2) % 1)
The reason for subtracting one after calculating the power of two representing b is that in binary, zero counts as a number. This is the reason that programmers count from zero, loops are initialized at zero, etc.
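The two formulas above can be checked with a short sketch. Python is used purely for illustration, and the function names `max_register_value` and `highest_set_bit` are hypothetical.

```python
def max_register_value(bits: int) -> int:
    """Largest unsigned value that fits in a register of the given width."""
    return 2 ** bits - 1


def highest_set_bit(value: int) -> int:
    """Zero-based index of the highest bit set, i.e. floor(log2(value))."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    # bit_length() counts the bits needed to represent the value, so
    # subtracting one gives the index of the top bit, counting from zero.
    return value.bit_length() - 1


print(max_register_value(8))
print(highest_set_bit(255), highest_set_bit(256))
```

Using `bit_length()` instead of floating point logarithms avoids rounding trouble with very large values.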
Values are typically referred to in groups of bits:
term | bits | bytes |
---|---|---|
nibble | 4 | 1/2 |
byte | 8 | 1 |
word | 16 | 2 |
dword | 32 | 4 |
qword | 64 | 8 |
Memory addresses may contain offsets, virtual addresses, or absolute addresses. These addresses can usually be edited or read in increments of b. On some x86 instruction set architectures, one or more registers can be appended to obtain a larger value.
Mid-Level
In mid-level languages, variable types are broken up further into integers, strings, character arrays, floating point, booleans, and more. In these languages, arrays are simply constructs of a particular type (non-mixed) and Associative Arrays or hashes are actually user-defined data types.
Integer:
- A whole number, positive or negative. An integer is typically four bytes, although the exact size depends on the language and platform.
Floating point:
- A number with a fractional part. Single-precision floating point numbers are four bytes; double-precision numbers are eight.
Character:
- A single-byte text character, e.g. "h".
String:
- Multiple text characters, e.g. "hello world!". In C and similar languages, strings are null-terminated, meaning that they end in \x00.
Boolean:
- Can only be true or false. Some languages represent these as binary 1 or 0 (one bit of information), respectively.
References:
- Also known as pointers, references store the address of data in memory.
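As a rough check of the sizes listed above, Python's struct module can report how many bytes common C-style types occupy. This is only a sketch: the "<" prefix asks for the struct module's standard sizes, which may differ from what a particular compiler uses, and the variable c_string is a made-up name.

```python
import struct

# "<" selects the struct module's standard sizes, independent of the host machine.
print(struct.calcsize("<i"))  # 4: 32-bit integer
print(struct.calcsize("<f"))  # 4: single-precision float
print(struct.calcsize("<d"))  # 8: double-precision float
print(struct.calcsize("<c"))  # 1: single character

# A C-style string: the text followed by a terminating \x00 byte.
c_string = b"hello world!" + b"\x00"
print(len(c_string))          # 13
```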
High-Level
In high-level languages, there are generally three types of variables: Scalars (single values), Arrays (or lists), and Associative Arrays (or hashes).
Scalars:
- A scalar is the simplest form of a variable, as it holds a single value, such as a person's name or a single number. Scalars are typically used for basic things like evaluating the same expression (think X + 10) with different input data.
Arrays:
- An array is a bit more complex than a scalar. Arrays are ordered collections of scalars, each reachable by its numeric position. For example, one may have a variable for a classroom that lists the names of all of its students rather than just one.
Associative Arrays:
- An associative array is similar to an array, but each value is stored under a key rather than a numeric position. For instance, one might have an associative array that maps the last names of the students in a class to their first names, such that if one knows a person's last name, one can look up their first name.
"Mixed" Variables:
- The true benefit of variables comes when one combines them. One may use a scalar called number to select an element of an array called months, pulling a month's name by its chronological number, or use a scalar called last_name as a key into a hash called full_names, so that every lookup of last_name returns the associated first name.
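The three kinds of variables and the "mixed" usage can be sketched in Python, where arrays are called lists and associative arrays are called dicts. The names number, months, last_name, and full_names come from the text; the sample data is invented.

```python
# Scalar: a single value.
teacher = "Ms. Carroll"

# Array (a Python list): several scalars in order.
months = ["January", "February", "March", "April"]

# Associative array (a Python dict): last name -> first name.
full_names = {"Liddell": "Alice", "Dodgson": "Charles"}

# "Mixed" use: a scalar selects an element of the array or the hash.
number = 3
print(months[number - 1])      # March (lists are indexed from zero)

last_name = "Liddell"
print(full_names[last_name])   # Alice
```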
Data operations
Arithmetic
- See also: Bitwise math and arithmetic in C++ and Perl
- TODO: Some notes about bitwise operations
- TODO: Explanation of "Modulus" and "Integer Division"
Casting and Type Conversions
- See also:
Sometimes variables need to be accessed as a different type than they are; for example, an integer as a string. This can be done by typecasting.
string str1 = (string)3
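In Python the same idea is expressed with conversion functions rather than a cast operator. The sketch below uses only built-ins; the variable names are illustrative.

```python
number = 3
text = str(number)        # integer -> string
print(text + "0")         # 30 (string concatenation, not addition)

parsed = int("42")        # string -> integer
print(parsed + 1)         # 43

print(int(3.9))           # 3 (conversion to int truncates toward zero)
print(float("2.5") * 2)   # 5.0
```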
Control Flow
Control flow is anything that affects the path a program follows as it runs; examples are conditionals and loops.
Conditional Validity
Validity is a simple concept. If a condition is true, the conditional or loop will act accordingly. Likewise, if a condition is false (a 0, an empty string, or a comparison that does not hold), the conditional or loop does the opposite: the block is skipped, or the loop stops.
Comparative Operators
Comparative operators are used to obtain a true or false value by comparing two other values. For example, one might have a 7 and a 5. Since neither is zero or an empty string, (7) on its own would always be true, and (5) would likewise always be true; but would 7 less than 5 be true? It would not, so that comparison returns false. Common comparative operators are equal to, not equal to, greater than, greater than or equal to, less than, and less than or equal to; the logical operators and and or combine the results of several comparisons. Each operator returns true when its condition is satisfied.
Comparative Operators and Symbols | ||
---|---|---|
Literal | Shorthand | Symbolic |
And | and | && |
Or | or | || |
Equal to | eq | == |
Not Equal to | ne | != |
Greater than | gt | > |
Greater than or equal to | ge | >= |
Less than | lt | < |
Less than or equal to | le | <= |
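The symbolic operators in the table behave as follows in Python, which spells the logical operators as the words and and or. This is a small sketch using the 7 and 5 from the text.

```python
a, b = 7, 5

print(a < b)             # False
print(a != b)            # True
print(a >= b)            # True
print(a > b and b > 0)   # True: both comparisons hold
print(a < b or a == 7)   # True: the second comparison holds

# Truthiness: 0 and the empty string count as false.
print(bool(7), bool(0), bool(""))   # True False False
```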
Conditional statements
Typically found as if, unless, else, or else if, conditionals take action according to the validity of a condition. That is to say, they either do an action contained in the block of code, or skip past the block altogether depending on the conditional used.
If
The most common conditional is if. If your condition returns true, for instance:
if (5 < 7) or if (5 != 7)
It proceeds with the block and performs its actions. If the condition returns false, the program ignores the block altogether and goes on to the code that follows it.
Unless
Unless is the exact opposite of if. It only continues with the block if the condition isn't met, that is, if the condition returns a false value. It is functionally equivalent to "if not", but is provided natively in some languages, such as Perl and Ruby.
unless ($variable < $limit) do something
This conditional can be reworded into an if, and one should always use what feels more natural to them ("if exists $variable" is the same as "unless not exists $variable").
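Python is one of the languages without a native unless, so it makes a convenient place to see the rewording. The values of variable and limit below are arbitrary, chosen so that the block runs.

```python
variable, limit = 12, 10

# Python has no "unless"; "if not" expresses the same test.
if not (variable < limit):
    print("limit reached")

# Equivalent rewording as a plain "if".
if variable >= limit:
    print("limit reached")
```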
Else
Else is a plain conditional in that it doesn't test anything of its own; it depends only on the outcome of the previous conditional. For instance, one could have a statement that reads
unless this variable is less than a limit
    do this
else
    print "Help, the ceiling's collapsing!"
If the previous conditional's block did not run, the else block will always run. If it did, the else block will never run.
Else if
Else if, more commonly written elseif (or elsif or elif, depending on the language), is exactly what it sounds like: a combination of else and if. It is reached only if the previous condition was false, and it then tests a new condition of its own. A good example out of Goldilocks and the Three Bears is:
if $temperature is greater than 10
    print "This porridge is too hot."
elseif $temperature is less than 10
    print "This porridge is too cold."
else
    print "This porridge is just right!"
The second condition is tested only if the first one isn't met. If the second one isn't met either, the else block at the end runs.
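The porridge example translates almost word for word into Python, which spells elseif as elif. The function name taste and the ideal parameter are assumptions made for this sketch.

```python
def taste(temperature, ideal=10):
    """Return Goldilocks' verdict for a porridge temperature."""
    if temperature > ideal:
        return "This porridge is too hot."
    elif temperature < ideal:
        return "This porridge is too cold."
    else:
        return "This porridge is just right!"


print(taste(15))   # This porridge is too hot.
print(taste(4))    # This porridge is too cold.
print(taste(10))   # This porridge is just right!
```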
Switch
Switch statements are similar to regular conditionals, except that they compare a single variable against several possible values. Each branch in a switch statement is referred to as a "case" statement and the branch taken if none of the cases match is called "default". Each case should be terminated with a break; in languages such as C, execution otherwise falls through into the next case. For example:
switch (a) {
    case 1:
        do_this()
        break
    case 2:
        do_this_2()
        break
    default:
        default()
}
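Not every language has a switch statement. One common way to get the same effect in Python is a dictionary that maps each case value to a handler function. This is a swapped-in technique rather than a real switch, and the helper name switch and the return strings are invented for the sketch.

```python
def do_this():
    return "case 1"


def do_this_2():
    return "case 2"


def default():
    return "default"


# Map each case value to the function that handles it.
cases = {1: do_this, 2: do_this_2}


def switch(a):
    # dict.get falls back to the default handler when no case matches.
    handler = cases.get(a, default)
    return handler()


print(switch(1))   # case 1
print(switch(2))   # case 2
print(switch(7))   # default
```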
Loops
Basic Loops
Loops are very similar to conditionals in that they test a condition in order to function; unlike conditionals, instead of continuing with the program, they go back to the start of their block for as long as that condition remains true. A loop whose condition is always true, either intentionally or not, is called an infinite loop and will run until the program is stopped or the loop is escaped. To avoid infinite loops, something inside the loop must eventually change the condition. Generally, loops test against variables that are changed inside the loop. $variable++ (the common form for increment, or +1) adds one to a variable; for example, if one has a loop that tests its variable to be less than a number (say 10), including $variable++ in the block makes the loop repeat until the variable reaches that number (10 times, starting from $variable equaling 0). Common examples of loops are while and until.
While
While is very similar to if, generally both in syntax and function. If a condition returns true, it runs the block, and then checks again to see if the condition is still true. If it is, it runs the block again. This means that if nothing in the block changes the condition, it will always be true and the loop will run infinitely. 1, since it is not an empty string, a false comparison, or 0, always tests as true, so while (1) will perform the same action or set of actions repeatedly.
while 1 is less than 10
    do this
do this after all that
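A terminating version of the loop can be sketched in Python by testing a variable that the block changes. The name counter is illustrative.

```python
counter = 1
while counter < 10:
    print("do this, pass", counter)
    counter += 1   # without this line the condition never changes

print("do this after all that")
print(counter)     # 10
```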
Until
Until is while's polar opposite. It runs the loop for as long as the condition tests false, and stops once the condition tests true. It is functionally equivalent to "while not", though it is provided natively in some languages. Taking the Goldilocks example again, one could say
until a random number between 1 and 20 is 10
    print "I'm not eating that!"
In this example, the program keeps testing random numbers until a 10 is reached, at which point one could program what would happen with the best porridge.
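Python has no until, so the sketch below writes it as "while not". The seeded generator rng and the counter refusals are assumptions added so that the run is repeatable and its result can be inspected.

```python
import random

rng = random.Random(0)   # fixed seed so the run is repeatable

refusals = 0
roll = rng.randint(1, 20)
# "until roll == 10" is written as "while not (roll == 10)".
while not (roll == 10):
    refusals += 1          # stands in for: print "I'm not eating that!"
    roll = rng.randint(1, 20)

print("Refused", refusals, "bowls before finding the right one.")
```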
Iteration
For
Foreach
Iteration is similar to looping, in that it performs a task repeatedly, but iteration is usually done for each item in a set (or list). These are usually called for loops or foreach loops. In most languages, a for loop loops over each item in the list and allows the user to refer to the current item by a given name. The following example loops over the list variables and prints each item in it, referring to the current item as variable.
for variable in variables
    print variable
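The pseudocode is already close to Python. A runnable sketch with made-up list contents, plus enumerate() for the cases where the position of each item is also needed:

```python
variables = ["alpha", "beta", "gamma"]

for variable in variables:
    print(variable)

# enumerate() supplies the position alongside each item.
for index, variable in enumerate(variables):
    print(index, variable)
```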
Functional Programming
Functions
A function contains code that will be executed whenever the function is "called." Functions can be used to prevent redundancy in code and keep it organized. Functions accept arguments and can return a value: this allows a function to be dynamic. For example, one might have a function that accepts two numbers, adds them together, and returns the result. This might look like (in pseudocode):
func add(integer a, integer b)
    return a + b;
print add(1, 2);
This code passes 1 and 2 as the arguments a and b, adds them together, and returns the result (3); the return value is then printed.
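The same function as a minimal Python sketch:

```python
def add(a, b):
    """Return the sum of the two arguments."""
    return a + b


print(add(1, 2))   # 3
```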
Recursion
A recursive function is one that calls itself. Like a loop, it repeats an action, and it returns once a terminating condition (the base case) is met. In the following example, our recursive function solves for the factorial of n: each call multiplies the running total by n and reduces n by one, and the function returns the total once n reaches zero.
func factorial(integer n, integer total = 1)
    if (n > 0)
        return factorial(n - 1, total * n)
    return total
print factorial(10);
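A minimal Python sketch of a recursive factorial with a running total. The parameter name total and its default of 1 are choices made for this example.

```python
def factorial(n, total=1):
    """Multiply total by n, n-1, ..., 1 using recursion."""
    if n <= 1:
        return total                     # base case: nothing left to multiply
    return factorial(n - 1, total * n)   # recursive case


print(factorial(10))   # 3628800
```

Without the base case the function would call itself until the interpreter's recursion limit is hit, which is the recursive counterpart of an infinite loop.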
Classes and objects
Object oriented programming allows a programmer to create objects to simplify programming. For example, a programmer might create a "cat" class and a "dog" class. Classes contain members: functions (known as methods when used in classes) and variables (often called properties).
class cat {
    integer size
    integer hair_color
    function purr
        // code to purr
}
class dog {
    integer size
    integer hair_color
    function growl
        // code to growl
}
States
Static utility classes
Instances
Our cats and dogs each have a size and hair color variable and a purr / growl function, respectively. An instance can then be created and used in this fashion:
cat kitty
kitty.size = 3
kitty.purr()
Extension
Objects also allow for inheritance, so a class can "inherit" all of the functions and variables from another class. Our cat and dog classes could therefore become an animal class, a cat class, and a dog class, with the shared variables declared only once in animal. This allows us to prevent redundant code.
class animal {
    public:
        integer size
        integer hair_color
}
class cat inherits animal {
    function purr()
        // code to purr
}
class dog inherits animal {
    function growl()
        // code to growl
}
cat kitty
kitty.size = 3
dog puppy
puppy.size = 3
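The animal, cat, and dog classes can be sketched in Python as follows. The constructor defaults and the strings returned by purr and growl are placeholders invented for the example.

```python
class Animal:
    def __init__(self, size=0, hair_color="unknown"):
        self.size = size
        self.hair_color = hair_color


class Cat(Animal):            # Cat inherits size and hair_color
    def purr(self):
        return "purr"


class Dog(Animal):            # Dog inherits the same variables
    def growl(self):
        return "growl"


kitty = Cat()
kitty.size = 3
puppy = Dog(size=3)
print(kitty.size, kitty.purr())    # 3 purr
print(puppy.size, puppy.growl())   # 3 growl
```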
Abstraction
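A class is abstract when it cannot be instantiated directly and exists only to be extended: child classes inherit its functionality and properties and supply whatever it leaves unimplemented. (A class that can no longer be extended is usually called final or sealed instead.)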
Scope
Objects also allow for permissions. Standard convention suggests that variables should be private (meaning only the class's functions can access the variables) and functions should be public (anything can access these elements). A programmer can also create a constructor and destructor for the class (functions called when the class is instantiated or destroyed, respectively). Example:
public
private
protected
class cat {
    private:
        integer size
        integer hair_color
    public:
        cat() { // constructor
            size = 3
        }
        ~cat() { // destructor
            // release any resources the object acquired
        }
        function purr()
            // code to purr
}
Creation, destruction, and validity
Accessors
Accessors are functions used to access private variables of a class. This allows a programmer to change how the class stores or validates its data without affecting the code that uses the class.
class cat {
    private:
        integer size
    public:
        function set_size(new_size) {
            size = new_size
        }
        function get_size() {
            return size
        }
}
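A Python sketch of the same accessors. Python does not enforce private members, so the leading underscore in _size is only a convention, and the negative-size check is an invented example of a change that can be made without touching any caller.

```python
class Cat:
    def __init__(self):
        self._size = 0   # leading underscore: private by convention

    def set_size(self, new_size):
        # The check can be added later without changing any caller.
        if new_size < 0:
            raise ValueError("size cannot be negative")
        self._size = new_size

    def get_size(self):
        return self._size


kitty = Cat()
kitty.set_size(3)
print(kitty.get_size())   # 3
```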
Constructors
Destructors
Referencing
- See also: Referencing in Perl
Referencing allows data to be passed by reference, instead of by value. This can make programs faster, as large pieces of data do not have to be copied for every function call. In many high-level languages, this happens implicitly for objects. However, lower level languages, such as C, generally require a reference to be passed explicitly. A pointer in C is declared with an asterisk (*), is given the address of a variable with an ampersand (&), and can be "dereferenced" (accessing the value at the address) with another asterisk. For example:
int value = 3;
int *ptr = &value;       /* ptr holds the address of value */
printf("%d\n", *ptr);    /* dereference: prints 3 */
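Python is one of the languages where references are passed implicitly: a function receives a reference to the caller's object rather than a copy. The two function names below are made up to show the difference between changing the object and rebinding the local name.

```python
def append_by_reference(items):
    items.append(4)        # mutates the caller's list


def rebind_locally(items):
    items = items + [5]    # builds a new list; the caller's is untouched
    return items


values = [1, 2, 3]
append_by_reference(values)
print(values)   # [1, 2, 3, 4]

rebind_locally(values)
print(values)   # [1, 2, 3, 4]
```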
Anonymous References
Many languages have the ability to call a function through a name held in a string, which is looked up while the program runs, rather than through a name fixed in the source code, for example "function_name"() vs function_name(). This allows a programmer to choose dynamically which function to call. These are also called anonymous function calls.
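Python does not allow "function_name"() directly, but a similar effect can be sketched with a dictionary that maps names to function objects. The functions and the actions table here are hypothetical.

```python
def purr():
    return "purr"


def growl():
    return "growl"


# Look the function up by a name that is only known at run time.
actions = {"purr": purr, "growl": growl}

function_name = "growl"
print(actions[function_name]())   # growl
```

An explicit table like this also limits which functions a string can reach, which matters when the name comes from user input.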