Difference between revisions of "User:Hatter/programming principles"

From NetSec
Revision as of 20:13, 30 November 2012

Languages

Programming languages and their nature are broken into 3 categories:

  • Low-to-mid-level: Assembly and machine languages
  • Mid-level: Compiled languages
    C • C++
  • High-level: Interpreted languages
    perl • ruby • php • python

These three types of languages differ considerably in the way that they become executable code. Assembly and machine languages can be either mid-level or low-level because embedded or real-mode code (firmware) can be written in machine language, but so can application or protected-mode code (software).

Pseudocode

Pseudocode is a human-readable expression of the intended functionality of a program. It tends to consist of very high-level descriptions of the code it is describing while leaving out aspects of programming languages that, while necessary for computers, are unneeded for human understanding. Pseudocode is often used in place of actual code during the initial design of a program, allowing the writer to outline what the program does and how it will be implemented in an easy-to-understand manner. Pseudocode can be neither compiled nor executed; it is written purely for human readers.

Style and organization

A critical strategy to employ for code organization is to follow the standards put forth by programmers who came before you. This is known as programming style and can spell the difference between a bad hacker, a good hacker, and an amazing hacker.

The vast majority of programming languages offer the ability to comment your code by using special syntax. It is important that you use comments to document your code in any place where it is not clear. It is always useful to think, "If I returned to this code in 20 years to improve it, would I be able to tell what it was doing?"

Variables are something that you will deal with constantly in your programming activities. These are names given to some piece of data which, like the name implies, can vary during the life of your program. Good programming style incorporates useful variable names. In other words, if you had a program that took input from a user and then changed the first letter of their name, it would be bad style to name that variable "apple" as opposed to "userInput". Something to strive for is code that is self-documenting, with organization and variable naming so good that it requires few comments to explain its execution.

A major benefit of keeping good programming style is that the clear organization of your program will help you spot errors much faster than someone who was careless in crafting their program (resulting in the dreaded "spaghetti code"). Not only does it help in debugging, but it will also make it easier to update your code to patch any exploits or vulnerabilities. There exist many other minute programming styles which are specific to the language you are learning (in Java it is customary to use camelCase to name variables, while in Python you will see variables written as name_of_var), but if you switch between projects you may find the inverse. It's best to use the existing practices of whatever application is being written, or to develop a standard before allowing others to contribute to your own code.


Variables and Data Types

See also: Variables in LUA, Ruby, PHP, Perl, Python, C, and C++

Variables, as the name implies, are named, user-defined items in programming whose values change based on the context of the programming language or the actions of the application.

Low-Level (Machine)

See also: binary data, registers, memory addressing

Any given piece of data that the processor can perform operations on is binary, usually written in hexadecimal, and does not exceed the size of a register for the given instruction set architecture.

  • If a register is b bits wide, the highest value that can be placed into it is 2 multiplied by itself b times, minus 1, or:
 2^b - 1
  • Conversely, the position of the highest bit set in any given value (counting from zero) can be calculated with:
log(value)/log(2) - (log(value)/log(2) % 1)

The reason for subtracting one after calculating the power of two is that a register of b bits holds 2^b distinct values and zero is one of them, so the largest value is 2^b - 1. This is the reason that programmers count from zero, loops are initialized at zero, etc.
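As a rough check of the two formulas above, the following Python sketch computes the largest value for a register of b bits and the index of the highest set bit. The function names are illustrative only and do not come from the original text. The sketch uses the built-in int.bit_length() rather than floating-point logarithms, since the logarithm form can lose precision for very large values.

```python
def max_register_value(b):
    """Largest unsigned value that fits in b bits: 2^b - 1."""
    return (1 << b) - 1


def highest_set_bit(value):
    """Zero-based index of the highest set bit, i.e. floor(log2(value))."""
    if value <= 0:
        raise ValueError("value must be positive")
    return value.bit_length() - 1


print(max_register_value(8))    # 255
print(max_register_value(32))   # 4294967295
print(highest_set_bit(255))     # 7
print(highest_set_bit(256))     # 8
```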

Values are typically referred to in groups of bits:

 term      bits      bytes     
nibble      4         1/2
 byte       8          1 
 word      16          2
dword      32          4
qword      64          8

Memory addresses may contain offsets, virtual addresses, or absolute addresses. These addresses can usually be edited or read in increments of b, but on some x86 architectures, one or more registers can be appended to obtain a larger value.

Mid-Level

In mid-level languages, variable types are broken up further into integers, strings, character arrays, floating point, booleans, and more. In these languages, arrays are simply constructs of a particular type (non-mixed) and Associative Arrays or hashes are actually user-defined data types.

Integer:

  • A whole number, positive or negative. An integer is typically four bytes, though the exact size depends on the language and platform.

Floating point:

  • A number with a decimal point. A single-precision floating point number is four bytes; a double-precision one is eight.

Character:

  • A single byte text character, e.g. "h"

String:

  • Multiple text characters, e.g. "hello world!". In C and similar languages, strings are null terminated, meaning that they end in \x00.

Boolean:

  • Can only be true or false. Some languages preprocess these as binary 1 or 0 (one bit), respectively.

References:

  • Also known as pointers, references store the address of data in memory.
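The sizes listed above can be inspected from Python as a rough illustration. This sketch assumes the standard sizes defined by the struct module (selected with the "<" prefix) and uses ctypes to build a C-style string buffer; it is not tied to any particular compiler.

```python
import ctypes
import struct

# Standard sizes ("<" prefix) are fixed by the struct module itself.
int_size = struct.calcsize("<i")      # 4-byte integer
float_size = struct.calcsize("<f")    # 4-byte single-precision float
double_size = struct.calcsize("<d")   # 8-byte double-precision float
char_size = struct.calcsize("<c")     # 1-byte character

# A C-style string buffer: ctypes appends the terminating \x00 byte.
text = b"hello world!"
buf = ctypes.create_string_buffer(text)

print(int_size, float_size, double_size, char_size)  # 4 4 8 1
print(len(text), ctypes.sizeof(buf))                 # 12 13
print(buf.raw[-1:])                                  # b'\x00'
```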

High-Level

In high-level languages, there are generally three types of variables: Scalars (single values), Arrays (or lists), and Associative Arrays (or hashes).

Scalars:

  • A scalar is the simplest form of a variable, as it holds a single value, such as a person's name or a single number. Scalars are typically used for basic things like running the same equation (think X + 10) with different input values.

Arrays:

  • An array is a bit more complex than a scalar. Arrays are formed from multiple scalars, which may be variables or predefined, static strings. You may have a variable for a classroom that lists the names of all the students rather than just one.

Associative Arrays:

  • An associative array is similar to an array, but each entry pairs a key with a value instead of holding an individual string. For instance, you might have an associative array that maps the last names of a class to first names, so that at any point, if you know a person's last name, you can find their first.

"Mixed" Variables:

  • The true benefit of variables comes when you combine them. You may use a scalar called number to select an element in an array called months, pulling a month's name by its chronological number, or use a scalar called last_name as a key into your hash called full_names so that every time you look up last_name it returns the associated first name.
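The three kinds of variable and the "mixed" usage described above might look like this in Python, where an array is a list and an associative array is a dict. The names number, months, last_name and full_names follow the text; the sample data is made up for illustration.

```python
# Scalar: a single value.
number = 3

# Array (a Python list): an ordered collection of scalars.
months = ["January", "February", "March", "April"]

# Associative array (a Python dict): last names mapped to first names.
full_names = {"Smith": "Alice", "Jones": "Bob"}

# "Mixed" usage: scalars select entries from the collections.
month_name = months[number - 1]      # lists are indexed from zero
last_name = "Smith"
first_name = full_names[last_name]

print(month_name)   # March
print(first_name)   # Alice
```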

Data operations

Arithmetic

See also: Arithmetic in C++ and Perl
  • TODO: Some notes about bitwise operations
  • TODO: Explanation of "Modulus" and "Integer Division"

Casting and Type Conversions

See also:

Sometimes variables need to be accessed as a different type than they are, for example an integer as a string. This can be done by typecasting.

 string str1 = (string)3
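In Python, the same idea is expressed by calling the target type as a function. The sketch below is only an illustration; the variable names other than str1 are hypothetical.

```python
str1 = str(3)          # integer -> string
num = int("42")        # string -> integer
ratio = float("2.5")   # string -> floating point
whole = int(2.9)       # float -> integer, truncates toward zero

print(repr(str1), num, ratio, whole)   # '3' 42 2.5 2

# A conversion that makes no sense raises an error instead of guessing.
try:
    int("not a number")
    failed = False
except ValueError:
    failed = True
print(failed)   # True
```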

Control Flow

See also: Control Flow in C++, Python, PHP, and Perl

Control flow is anything that affects the path a program follows as it runs. Examples are conditionals and loops.

Conditional Validity

Validity is a simple concept. If a condition is true, the conditional or loop will act accordingly. Likewise, if it is false (that is, a 0, an empty string, or a comparison that does not hold), the conditional or loop will behave in the opposite manner.

Comparative Operators

Comparative operators are used when you want to return a true or false value by comparing two other values. You might have a 7 and a 5. Since neither is 0 or an empty string, (7) on its own would always be true, and (5) would likewise be true, but (7 < 5) is false because 7 is not less than 5. Common comparative operators are equal to, not equal to, greater than, greater than or equal to, less than, and less than or equal to; the logical operators and and or combine the results of several comparisons. Each returns true when the relationship it describes is satisfied.

Comparative Operators and Symbols

 Literal                     Shorthand   Symbolic
 And                         and         &&
 Or                          or          ||
 Equal to                    eq          ==
 Not equal to                ne          !=
 Greater than                gt          >
 Greater than or equal to    ge          >=
 Less than                   lt          <
 Less than or equal to       le          <=
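As a small illustration of the operators in the table, here is how they behave in Python, which uses the symbolic forms for comparisons and the words and / or for combining them. The values 7 and 5 come from the text; the variable names are arbitrary.

```python
a, b = 7, 5

less = a < b                    # is 7 less than 5?
not_equal = a != b
both = (a >= b) and (b > 0)     # true only if both comparisons hold
either = (a < b) or (a == 7)    # true if at least one comparison holds

print(less, not_equal, both, either)   # False True True True

# Zero and the empty string test as false; other values test as true.
print(bool(0), bool(""), bool(7))      # False False True
```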


Conditional statements

See also: Conditional statements in C, C++, PHP, Perl, Python, and Ruby

Typically found as if, unless, else, or else if, conditionals take action according to the validity of a condition. That means they either do an action contained in the block of code, or skip past the block altogether depending on the conditional used.

If

The most common conditional is if. If your condition returns true, for instance:

 if (5 < 7) or if (5 != 7)

it proceeds with the block and performs its actions. If it returns false, it ignores the block altogether and continues with the code after it.

Unless

Unless is the exact opposite of if. It only continues with the block if the condition isn't met and returns a false value. It is functionally equivalent to "if not", but is provided natively in many languages.

 unless ($variable < $limit)
   do something

This conditional can be reworded into an if, and you should use what feels more natural to you ("if exists $variable" is the same as "unless not exists $variable").

Else

Else is a plain conditional in that it doesn't test anything of its own; it depends only on the previous conditional. For instance, you could have a statement that reads

 unless this variable is less than a limit
   do this
 else
   print "You're not allowed!" 

The else block runs only when the preceding conditional's block was skipped. If that block ran, the else block never runs.

Else if

Else if, or more commonly elseif, is exactly what it sounds like: a combination of if and else. It tests a new condition only if the previous one was false. A good example, borrowed from Goldilocks and the Three Bears, is:

 if $temperature is greater than 10
   print "This porridge is too hot" 
 elseif $temperature is less than 10 
   print "This porridge is too cold" 
 else 
   print "This porridge is just right!" 

Only if the first condition isn't met does it move on to test the next one. If the next one isn't met either, the else at the end runs.
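The Goldilocks example translates almost directly into Python, which spells elseif as elif. Python has no unless keyword, so the second function in this sketch uses "if not" instead. Both function names are hypothetical and only serve the illustration.

```python
def describe_porridge(temperature):
    """Return a verdict using if / elif / else."""
    if temperature > 10:
        return "This porridge is too hot"
    elif temperature < 10:
        return "This porridge is too cold"
    else:
        return "This porridge is just right!"


def over_limit(variable, limit):
    """'unless variable < limit' written as 'if not'."""
    if not (variable < limit):
        return True
    return False


print(describe_porridge(15))   # This porridge is too hot
print(describe_porridge(10))   # This porridge is just right!
print(over_limit(12, 10))      # True
```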

Switch

Switch statements are similar to regular conditionals, except that they deal with different values of a single variable. Each conditional in a switch statement is referred to as a "case" statement, and the result used if none of the cases match is called "default". Each case should be terminated with a break. For example:

 switch (a){
   case 1:
     do_this()
     break
   case 2:
     do_this_2()
     break
   default:
     do_default()
 }
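Python only gained a comparable statement (match) in version 3.10, so a common substitute is a dictionary that maps each case value to a function, with a fallback for the default. The sketch below takes that approach; the handler names mirror the pseudocode, do_default and switch are hypothetical names, and the return values are made up.

```python
def do_this():
    return "case 1"


def do_this_2():
    return "case 2"


def do_default():
    return "default"


def switch(a):
    """Pick a handler by the value of a, falling back to the default."""
    cases = {1: do_this, 2: do_this_2}
    handler = cases.get(a, do_default)
    return handler()


print(switch(1))    # case 1
print(switch(2))    # case 2
print(switch(99))   # default
```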

Loops

Basic Loops

See also: Loops in C, C++, Perl, PHP, and Ruby

Loops are very similar to conditionals in that they test a condition in order to function, but unlike conditionals, instead of continuing with the program they go back to the start of their block until that condition is no longer true. A loop whose condition is always true, either intentionally or not, is called an infinite loop and will run until the program is stopped or the loop is escaped. To avoid infinite loops, you must change the condition in some way. Generally loops test against variables that are changed inside the loop. $variable++ (the common form for increment, or +1) takes a variable and adds one to it, so if you have a loop that tests whether your variable is less than a number (say 10), including $variable++ in the block makes the loop repeat until the variable reaches that number (10 times if $variable starts at 0). Common loops are while and until.

While

While is very similar to if, generally both in syntax and function. If a condition returns true, it runs the block, and then checks again to see if the condition is still true. If it is, it runs the block again. This means that if you don't do something to change the condition somewhere in the block, it will always be true and the loop will run infinitely. Since 1 is not an empty string, a false comparison, or 0, it always tests as true, so while (1) will perform the same action or set of actions over and over.

 while 1 is less than 10
   do this
 do this after all that
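A loop that does terminate needs its condition to change inside the block. A minimal Python sketch, assuming a counter that starts at 0 (Python has no ++ operator, so the increment is written as += 1):

```python
variable = 0
iterations = 0

while variable < 10:
    iterations += 1   # the work done on each pass
    variable += 1     # change the condition so the loop can end

print(iterations)   # 10
print(variable)     # 10
```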

Until

Until is while's polar opposite. It tests for a false value, and runs the loop until the value tests true. It is functionally equivalent to "while not", though it is provided natively in many languages. Taking the Goldilocks example again, you could say

 until a random number between 1 and 20 is 10
   print "I'm not eating that!" 

This would make your program keep testing random numbers until a 10 is reached, at which point you could program what would happen with the best porridge.
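Python has no until loop, so the example above can be sketched with "while not". The number of refusals varies from run to run because the numbers are random; the names rng, number and attempts are illustrative.

```python
import random

rng = random.Random()
attempts = 0

number = rng.randint(1, 20)        # randint includes both endpoints
while not (number == 10):          # "until number is 10"
    print("I'm not eating that!")
    attempts += 1
    number = rng.randint(1, 20)

print("Just right after", attempts, "refusals")
```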

Iteration

For

Foreach

Iteration is similar to looping in that it performs a task repeatedly, but iteration is usually done for each item in a set (or list). These are usually called for loops or foreach loops. In most languages, a for loop loops over each item in the list and allows the user to refer to the current item by a given name. The following example loops over the list variables and prints each item in it, referring to it as variable.

 for variable in variables
   print variable
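The pseudocode above is already close to valid Python. A runnable version, with made-up list contents, plus enumerate() for the case where the position of each item is also needed:

```python
variables = ["alpha", "beta", "gamma"]

# Visit each item in turn, calling the current one "variable".
for variable in variables:
    print(variable)

# enumerate() also yields the zero-based position of each item.
indexed = []
for index, variable in enumerate(variables):
    indexed.append((index, variable))

print(indexed)   # [(0, 'alpha'), (1, 'beta'), (2, 'gamma')]
```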

Functional Programming

Functions

See also: Functions in C++, PHP, Perl, Python, and LUA

A function contains code that will be executed whenever the function is "called". A common use is to prevent redundancy in code and keep it organized. Functions accept arguments and can return a value. This allows a function to be dynamic. For example, you might have a function that accepts two numbers, adds them together, and returns the result. This might look like (in pseudocode):

 func add(integer a, integer b)
   return a + b;
 print add(1, 2);

This code passes the two arguments (in this case a = 1 and b = 2) to the function, which adds them together and returns the result (3). The return value is then printed.

Recursion

A recursive function works by calling itself, which is similar to a loop, and returns once the action it is completing has finished. In the following example, our recursive function solves for the factorial of n: each time it is called it multiplies the running total by n and reduces n by one, and it returns the total once n reaches 1 or less.

 func factorial(integer n, integer total = 1)
   if (n > 1)
     return factorial(n - 1, total * n)
   return total
 print factorial(10);
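A runnable Python sketch of a recursive factorial follows. It assumes the usual definition of a factorial (the product of the integers from 1 to n) and carries the running product in a second argument named total.

```python
def factorial(n, total=1):
    """Multiply total by n, then recurse with n - 1 until n reaches 1."""
    if n > 1:
        return factorial(n - 1, total * n)
    return total


print(factorial(5))    # 120
print(factorial(10))   # 3628800
```

Each call either recurses with a smaller n or returns, so the recursion is guaranteed to stop for any non-negative integer.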

Classes and objects

See also: Objects in C++, PHP, and Python

Object oriented programming allows a programmer to create objects to simplify programming. For example, a programmer might create a "cat" class and a "dog" class. Classes contain properties such as functions (known as methods when used in classes) and variables.

 class cat {
   integer size
   integer hair_color
   function purr
     // code to purr
 }
 class dog {
   integer size
   integer hair_color
   function growl
     // code to growl
 }

States

Static utility classes

Instances

Our cats and dogs each have a size and hair color variable and a purr / growl function respectively. An instance can then be instantiated and used in this fashion:

 cat kitty
 kitty.size = 3
 kitty.purr()
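In Python, the cat class and its instance could be sketched as follows. The initial values and the string returned by purr() are assumptions made for the example.

```python
class Cat:
    def __init__(self):
        # Each instance gets its own copy of these variables.
        self.size = 0
        self.hair_color = 0

    def purr(self):
        return "purr"


kitty = Cat()    # create an instance
kitty.size = 3   # set a variable on that instance
print(kitty.size, kitty.purr())   # 3 purr
```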


Extension

Objects also allow for inheritance: a class can "inherit" all of the functions and variables from another class, so our cat and dog classes could become an animal, cat, and dog class. This allows us to prevent redundant code.

 class animal {
   public:
     integer size
     integer hair_color
 }
 class cat inherits animal {
   function purr()
     // code to purr
 }
 class dog inherits animal {
   function growl()
     // code to growl
 }
 cat kitty
 kitty.size = 3
 dog puppy
 puppy.size = 3
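The same hierarchy in Python, as a sketch: the shared variables live in Animal, and Cat and Dog add only what is specific to them. Default values and return strings are made up for the example.

```python
class Animal:
    def __init__(self, size=0, hair_color=0):
        self.size = size
        self.hair_color = hair_color


class Cat(Animal):       # Cat inherits size and hair_color
    def purr(self):
        return "purr"


class Dog(Animal):       # Dog inherits them as well
    def growl(self):
        return "growl"


kitty = Cat()
kitty.size = 3
puppy = Dog()
puppy.size = 3
print(kitty.size, kitty.purr(), puppy.size, puppy.growl())   # 3 purr 3 growl
print(isinstance(kitty, Animal))                             # True
```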

Abstraction

A class is abstract when it cannot be instantiated directly and exists only so that child classes can extend it, inheriting its functionality and properties and supplying whatever it leaves unimplemented.
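The following sketch assumes the common definition of an abstract class: one that cannot be instantiated on its own and that leaves at least one method for its subclasses to implement. It uses Python's abc module; the speak method is a hypothetical addition, not part of the original cat and dog example.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    def __init__(self, size):
        self.size = size

    @abstractmethod
    def speak(self):
        """Each concrete animal must provide its own sound."""


class Cat(Animal):
    def speak(self):
        return "purr"


kitty = Cat(3)
print(kitty.speak())   # purr

# Instantiating the abstract class itself is refused.
try:
    Animal(3)
    refused = False
except TypeError:
    refused = True
print(refused)   # True
```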


Scope

Objects also allow for permissions. Standard convention suggests that variables should be private (meaning only the class's own functions can access them) and functions should be public (anything can access them). A programmer can also create a constructor and destructor for the class (functions called when an instance of the class is created or destroyed, respectively). Example:

public

private

protected

 class cat {
   private:
     integer size
     integer hair_color
   public:
     cat() { // constructor
       size = 3
     }
     ~cat() { // destructor
       // release any resources held by the object
     }
     function purr()
       // code to purr
 }

Creation, destruction, and validity

Accessors

Accessors are functions used to access private variables of a class. This allows a programmer to change the internals of the class without affecting how the class is used.

 class cat {
   private:
     integer size
   public:
     function set_size(new_size) {
       size = new_size
     }
     function get_size() {
       return size
     }
 }
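Python has no enforced private keyword; a leading underscore marks a variable as private by convention, and the property decorator provides the accessor pair. In this sketch the validation inside the setter is an assumed example of changing the class's internals without changing how callers use it.

```python
class Cat:
    def __init__(self, size=3):
        # Leading underscore: private by convention only.
        self._size = size

    @property
    def size(self):
        """Accessor (getter) for the private size."""
        return self._size

    @size.setter
    def size(self, new_size):
        """Accessor (setter); rejects sizes that make no sense."""
        if new_size <= 0:
            raise ValueError("size must be positive")
        self._size = new_size


kitty = Cat()
kitty.size = 5      # goes through the setter
print(kitty.size)   # 5
```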

Constructors

Destructors

Referencing

See also: Referencing in Perl

Referencing allows data to be passed by reference, instead of by value. This can make programs much faster because data does not have to be copied for every function call. In many high-level languages, this happens implicitly, as it is more efficient, but lower-level languages, such as C, require a reference to be passed explicitly. A pointer in C is declared with an asterisk (*), is assigned an address with the address-of operator (&), and can be "dereferenced" (accessing the value at the address instead of the address itself) with another asterisk. For example:

 int value = 3;
 int *ptr = &value;
 printf("%d\n", *ptr);
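For contrast, Python passes references to objects implicitly. In this sketch (function names are hypothetical), a function that mutates a list changes the caller's data because no copy is made, while rebinding the parameter name inside a function leaves the caller's list untouched.

```python
def append_item(items):
    items.append(4)   # mutates the same list the caller holds


def rebind(items):
    items = [0]       # only the local name now points elsewhere
    return items


values = [1, 2, 3]
append_item(values)
rebind(values)
alias = values        # a second name for the same object, not a copy

print(values)            # [1, 2, 3, 4]
print(alias is values)   # True
```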

Anonymous References

Many languages have the ability to call a function by its name rather than its address, for example "function_name"() vs function_name(). This allows a programmer to dynamically call functions.
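One way to sketch calling a function by name in Python is to keep a dictionary from names to function objects and look the name up at run time. The functions and the FUNCTIONS table are hypothetical; an explicit table is used here so that only the listed functions can be reached through a string.

```python
def greet():
    return "hello"


def farewell():
    return "goodbye"


# Explicit table of functions that may be called by name.
FUNCTIONS = {"greet": greet, "farewell": farewell}


def call_by_name(name):
    """Look up a function by its name and call it."""
    function = FUNCTIONS.get(name)
    if function is None:
        raise KeyError("unknown function: " + name)
    return function()


print(call_by_name("greet"))      # hello
print(call_by_name("farewell"))   # goodbye
```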

Functionally anonymous collection handling