Difference between revisions of "User:Hatter/programming principles"

== Languages ==
 
 
Revision as of 18:23, 13 May 2013

Programming languages and miscellaneous command texts of the same nature are broken into 3 categories:

  • Low-to-mid-level: Assembly and machine languages
  • Mid-level: C, C++
  • High-level: Perl, Ruby, PHP, Python, Lua

These three "levels" of languages differ immensely in the way they are translated into executable code in an operating system's memory. Assembly and machine languages can be classified as either low level or mid level because machine language is used to write both firmware (embedded or real-mode code) and software (application or protected-mode code).

Pseudocode

Pseudocode is a human readable expression of the intended functionality of a program. It tends to consist of very high level descriptions of the code in question while leaving out aspects of programming languages that, while necessary for proper execution, are an unnecessary hindrance to a human's ability to understand the code. Pseudocode is often used in place of actual code during the initial design of a program to allow the writer to outline the function of their program and how they intend to implement it in an easy to understand manner. Pseudocode cannot be compiled nor executed; it is solely for human interpretation.

Style and organization

A critical strategy for code organization is to follow previously established standards, which will, more often than not, allow for easier interpretation and debugging of written code, if not for oneself then for others. This is known as programming style, the proper use of which can be the difference between a bad hacker, a good hacker, and an amazing hacker.

The vast majority of programming languages offer the ability to comment one's code through use of a special syntax. It is important that one uses comments to document one's code in any place where the code can be perceived as not being perfectly clear, with "perfectly" being the operative word. It is always useful to think "If I returned to this code in 5 years, would I be able to quickly understand exactly what it was doing?".

Variables are something that one will deal with constantly in one's programming activities. Variables are "names" given to some piece of data and, like the term implies, can vary during the course of one's program. Satisfactory programming style incorporates informative variable names. For example, if one has created a program that took input from a user and changed the first letter of their name, it would be bad style to name that variable "apple", as opposed to "userInput". The ideal is written code that is self-documenting, the organization and variable naming scheme being so good that the final product requires very few comments to explain its execution.

The primary benefit of good programming style is that clear organization of one's program will often prove invaluable in debugging. One will generally be able to spot errors at a much faster rate, compared to someone who was careless in crafting their program (resulting in the dreaded "spaghetti code"). Proper use of programming style also makes it easier to update one's code if patching any exploits/vulnerabilities becomes necessary.

There are sometimes minute customary programming practices which are specific to a particular language (in Java it is customary to use camelCase to name variables, while in Python one may see variables written as name_of_var), but, like any other stylistic practices, they may be either ignored, violated, or entirely absent in any given program.

It's best to use existing practices within whatever application is being written, the next best alternative being to develop a standard before allowing others to contribute to one's own code.


Variables and Data Types

See also: Variables in LUA, Ruby, PHP, Perl, Python, C, and C++

Variables, as the name implies, are named, user-defined items whose values can change while a program runs. How they are declared and stored depends on the context of the code being written, such as the programming language used or the actions of the application.

Low-Level (Machine)

See also: binary data, registers, memory addressing

Any piece of data that the processor can operate on is a binary value, usually written in hexadecimal, that does not exceed the size of a register for the given instruction set architecture.

  • If a register is b bits wide, the highest unsigned value that can be placed into it is 2 raised to the power b, minus 1, or:
 2^b - 1
  • Conversely, the position of the highest bit set in any given value (counting from zero) can be calculated with:
 log(value)/log(2) - (log(value)/log(2) % 1)
    which is the base-2 logarithm of the value, rounded down to a whole number.

One is subtracted from the power of two because a register of b bits holds 2^b distinct values and zero is one of them, so the values run from 0 to 2^b - 1. This is also the reason that programmers count from zero, loops are initialized at zero, etc.
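The two formulas above can be checked with a short Python sketch. The function names max_register_value and highest_set_bit are made up for illustration, and int.bit_length() is used in place of the logarithm form to avoid floating point rounding.

```python
def max_register_value(bits):
    """Largest unsigned value that fits in a register of the given width."""
    return 2 ** bits - 1


def highest_set_bit(value):
    """Zero-based position of the highest set bit (base-2 log, rounded down)."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    return value.bit_length() - 1


print(hex(max_register_value(8)))   # 0xff
print(hex(max_register_value(32)))  # 0xffffffff
print(highest_set_bit(255))         # 7
print(highest_set_bit(256))         # 8
```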

Values are typically referred to in groups of bits:

 term      bits      bytes
 nibble      4        1/2
 byte        8         1
 word       16         2
 dword      32         4
 qword      64         8

Memory addresses may contain offsets, virtual addresses, or absolute addresses. These addresses can usually be read or written in increments of b bits. On some x86 instruction set architectures, two or more registers can be combined to hold a larger value (for example, the EDX:EAX register pair).

Mid-Level

In mid-level languages, variable types are broken up further into integers, strings, character arrays, floating point, booleans, and more. In these languages, arrays are simply constructs of a particular type (non-mixed) and Associative Arrays or hashes are actually user-defined data types.

Integer:

  • A whole number, positive or negative. An integer is commonly four bytes, though the exact size depends on the language, type, and platform.

Floating point:

  • A number with a decimal point. A single-precision floating point number is four bytes; a double-precision one is eight.

Character:

  • A single byte text character, e.g. "h"

String:

  • Multiple text characters, e.g. "hello world!". In C and languages that follow its conventions, strings are null terminated, meaning that they end in \x00.

Boolean:

  • Can only be true or false. Some languages represent these as 1 or 0, respectively; a single bit is enough in principle, although a boolean usually occupies at least one byte in memory.

References:

  • Also known as pointers, references store the address of data in memory.
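The sizes listed above can be inspected from Python with the standard struct module, which packs values the way a mid-level language such as C lays them out. This is only a sketch: the "<" prefix asks struct for its standard sizes rather than those of the host platform, and the variable name c_string is made up for illustration.

```python
import struct

# "<" selects the struct module's standard sizes, independent of the host platform.
print(struct.calcsize("<i"))  # 4: 32-bit integer
print(struct.calcsize("<f"))  # 4: single-precision float
print(struct.calcsize("<d"))  # 8: double-precision float
print(struct.calcsize("<c"))  # 1: single character

# A C-style string is the text followed by a terminating zero byte.
c_string = "hello world!".encode("ascii") + b"\x00"
print(len(c_string))          # 13: 12 characters plus the terminator
print(c_string[-1])           # 0
```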

High-Level

In high-level languages, there are generally three types of variables: Scalars (single values), Arrays (or lists), and Associative Arrays (or hashes).

Scalars:

  • A scalar is the simplest form of a variable, as it holds a single value, for example a person's name or a single number. Scalars are typically used for basic things like running the same equation (think X + 10) with changing data.

Arrays:

  • An array is a bit more complex than a scalar. An array is an ordered collection of scalars, which may be variables or fixed values such as literal strings. For example, one may have a variable for a classroom that lists the names of all the students rather than just one.

Associative Arrays:

  • An associative array is similar to an array, but each value is stored under a key and looked up by that key instead of by position. For instance, one might have an associative array that maps the last names of a class to their first names, so that at any point, if one knows a person's last name, one can find their first name.

"Mixed" Variables:

  • The true benefit of variables comes when one combines them. One may use a scalar called number to select an element of an array called months, pulling a month's name by its chronological number, or use a scalar called last_name as the key into a hash called full_names, so that looking up last_name returns the matching first name.
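As a rough illustration of the three kinds of variable, here they are in Python, where arrays are called lists and associative arrays are called dicts. The names and sample data are invented for the example.

```python
# Scalar: a single value.
number = 3

# Array (list): an ordered collection of values.
months = ["January", "February", "March", "April"]

# Associative array (dict): values looked up by key.
full_names = {"Liddell": "Alice", "Hare": "March"}

# Combining them: a scalar selects an element from the list or the dict.
print(months[number - 1])     # March (lists are indexed from zero)
last_name = "Liddell"
print(full_names[last_name])  # Alice
```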

Data operations

Arithmetic

See also: Bitwise math and arithmetic in C++ and Perl
  • TODO: Some notes about bitwise operations
  • TODO: Explanation of "Modulus" and "Integer Division"

Casting and Type Conversions

See also:

Sometimes variables need to be accessed as a different type than they are; for example, an integer as a string. This can be done by typecasting.

 string str1 = (string)3
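In Python the same idea is expressed by calling the target type as a function, which builds a new value of that type rather than reinterpreting the old one. A small sketch with made-up variable names:

```python
text = str(3)         # integer to string
number = int("42")    # string to integer
ratio = float("2.5")  # string to floating point
truncated = int(7.9)  # float to integer drops the fractional part

print(text + "0")     # 30: string concatenation, not addition
print(number + 1)     # 43
print(ratio * 2)      # 5.0
print(truncated)      # 7
```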

Control Flow

See also: Control Flow in C++, Python, PHP, and Perl

Control flow is anything that affects the path a program follows as it runs; examples are conditionals and loops.

Conditional Validity

Validity is a simple concept. If a condition is true, the conditional or loop will act accordingly. Likewise, if a condition is false (a 0, an empty string, or a comparison that does not hold), the conditional or loop does the opposite.

Comparative Operators

Comparative operators return a true or false value by comparing two other values. For example, one might have a 7 and a 5. Since neither is 0 or an empty string, (7) on its own would always be true, and (5) would likewise always be true; but is 7 less than 5 true? It is not, so that comparison returns false. Common comparative operators are equal to, not equal to, greater than, greater than or equal to, less than, and less than or equal to; the logical operators and and or combine the results of several comparisons. Each operator returns true when its condition is satisfied.

Comparative Operators and Symbols

 Literal                     Shorthand    Symbolic
 And                         and          &&
 Or                          or           ||
 Equal to                    eq           ==
 Not Equal to                ne           !=
 Greater than                gt           >
 Greater than or equal to    ge           >=
 Less than                   lt           <
 Less than or equal to       le           <=
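The operators in the table can be tried out directly. This Python sketch uses the symbolic comparison forms together with Python's and / or keywords; the variables a and b are just the 7 and 5 from the text.

```python
a, b = 7, 5

print(a < b)             # False: 7 is not less than 5
print(a != b)            # True
print(a >= b and b > 0)  # True: both comparisons hold
print(a < b or a == 7)   # True: the second comparison holds

# Values tested on their own: 0 and the empty string are false.
print(bool(0), bool(""), bool(7))  # False False True
```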


Conditional statements

See also: Conditional statements in C, C++, PHP, Perl, Python, and Ruby

Typically found as if, unless, else, or else if, conditionals take action according to the validity of a condition. That is to say, they either do an action contained in the block of code, or skip past the block altogether depending on the conditional used.

If

The most common conditional is if. If the condition returns true, for instance in either of:

 if (5 < 7)
 if (5 != 7)

the program proceeds with the block and performs its actions. If the condition returns false, the program ignores the block altogether and goes on to the code after it.

Unless

Unless is the exact opposite of if. It only continues with the block if the condition isn't met, that is, if the condition returns a false value. It is functionally equivalent to "if not", but is provided as a native keyword in some languages, such as Perl and Ruby.

 unless ($variable < $limit)
   do something

This conditional can be reworded into an if, and one should use whichever reads more naturally ("if exists $variable" is the same as "unless not exists $variable").

Else

Else is a plain conditional in that it doesn't test anything of its own; it depends only on the outcome of the previous conditional. For instance one could have a statement that reads

 unless this variable is less than a limit
   do this
 else
   print "Help, the ceiling's collapsing!"

If the block of the previous conditional was skipped, the else block will always run. If the previous block ran, the else block will never run.

Else if

Else if, or more commonly elseif is exactly what it sounds like: A combination of if and else. It tests the previous condition and then tests a new condition if the previous one was false. A good example out of Goldilocks and the Three Bears is:

 if $temperature is greater than 10
   print "This porridge is too hot." 
 elseif $temperature is less than 10 
   print "This porridge is too cold." 
 else 
   print "This porridge is just right!" 

The next condition is tested only if the first one isn't met. If the next one isn't met either, the else block at the end runs.
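The porridge example translates almost line for line into Python, where else if is spelled elif. Wrapping it in a function (the name taste is made up here) makes each branch easy to try:

```python
def taste(temperature):
    """Return Goldilocks' verdict for a porridge temperature."""
    if temperature > 10:
        return "This porridge is too hot."
    elif temperature < 10:
        return "This porridge is too cold."
    else:
        # Reached only when neither condition above was true.
        return "This porridge is just right!"


print(taste(15))  # This porridge is too hot.
print(taste(3))   # This porridge is too cold.
print(taste(10))  # This porridge is just right!
```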

Switch

Switch statements are similar to regular conditionals, except that they test a single variable against several different values. Each conditional in a switch statement is referred to as a "case" statement and the fallback, used if none of the cases match, is called "default". In C-like languages each case should be terminated with a break; otherwise execution falls through into the next case. For example:

 switch (a) {
   case 1:
     do_this()
     break
   case 2:
     do_this_2()
     break
   default:
     do_default()
 }
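Not every language has a switch keyword. One common substitute, sketched here in Python, is a dictionary that maps each case value to a function, with a fallback standing in for "default". This is a different technique from a true switch statement, and all of the function names are placeholders.

```python
def do_this():
    return "handled case 1"


def do_this_2():
    return "handled case 2"


def do_default():
    return "handled default"


def switch(a):
    """Pick a handler by value; fall back to do_default when nothing matches."""
    cases = {1: do_this, 2: do_this_2}
    handler = cases.get(a, do_default)
    return handler()


print(switch(1))   # handled case 1
print(switch(2))   # handled case 2
print(switch(99))  # handled default
```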

Loops

Basic Loops

See also: Loops in C, C++, Perl, PHP, Python and Ruby

Loops are very similar to conditionals in that they test a condition in order to function; unlike conditionals, instead of continuing with the program, they go back to the start of their block until that condition is no longer true. A loop whose condition is always true, whether intentionally or not, is called an infinite loop and will run until the program is stopped or the loop is escaped. To avoid infinite loops, something inside the loop must eventually change the condition. Generally, loops test against variables that are changed inside the loop. $variable++ (the common form for increment, or +1) takes a variable and adds one to its value; for example, if a loop tests that its variable is less than a number (say 10), adding $variable++ inside the block makes the loop repeat until the variable reaches that number (10 times, starting from $variable equaling 0). Common examples of loops are while and until.

While

While is very similar to if, generally both in syntax and function. If the condition returns true, it runs the block, and then checks again to see if the condition is still true. If it is, it runs the block again. This means that if nothing in the block changes the condition, it will always be true and the loop will run infinitely. 1, since it is not an empty string, a false comparison, or 0, always tests as true, so while (1) will perform the same action or set of actions repeatedly.

 while $variable is less than 10
   do this
   add 1 to $variable
 do this after all that
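A minimal Python version of a counting while loop follows. The counter is changed inside the block, so the condition eventually becomes false and the loop ends; the names variable and passes are illustrative.

```python
variable = 0
passes = 0

while variable < 10:
    passes += 1    # the work done on each pass
    variable += 1  # change the condition so the loop can end

# Runs once, after the loop has finished.
print(passes)    # 10
print(variable)  # 10
```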

Until

Until is While's polar opposite. It tests for a false value, and runs the loop until the condition tests true. It is functionally equivalent to "while not", though it is provided as a native keyword in some languages. Taking the Goldilocks example again, one could say

 until a random number between 1 and 20 is 10
   print "I'm not eating that!" 

In this example, the program keeps testing random numbers until a 10 is reached, at which point one could program what would happen with the best porridge.
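Python has no until keyword, so the same loop is written with "while not". The sketch below uses a seeded random generator so that repeated runs behave identically; the seed and the variable names are arbitrary choices for the example.

```python
import random

rng = random.Random(0)  # fixed seed so repeated runs behave the same
roll = rng.randint(1, 20)
rejected = 0

# "until roll is 10" written as "while not roll == 10"
while not roll == 10:
    rejected += 1  # "I'm not eating that!"
    roll = rng.randint(1, 20)

print("Bowls rejected before the right one:", rejected)
```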

Iteration

For

Foreach

Iteration is similar to looping, in that it performs a task repeatedly, but iteration is usually done for each item in a set (or list). These are usually called for loops or foreach loops. In many languages, a for or foreach loop visits each item in the list in turn and lets the programmer refer to the current item by a given name. The following example loops over the list variables and prints each item in it, referring to the current item as variable.

 for variable in variables
   print variable
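The pseudocode above is already very close to Python. A runnable sketch with invented sample data, plus enumerate() for the case where the position of each item is also needed:

```python
variables = ["alpha", "beta", "gamma"]

# Visit each item in turn, calling the current one "variable".
for variable in variables:
    print(variable)

# enumerate() supplies the position alongside each item.
for index, variable in enumerate(variables):
    print(index, variable)

# Collecting results while iterating.
lengths = [len(variable) for variable in variables]
print(lengths)  # [5, 4, 5]
```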

Functional Programming

Functions

See also: Functions in C++, PHP, Perl, Python, and LUA

A function contains code that will be executed whenever the function is "called." Functions can be used to prevent redundancy in code and keep it organized. Functions accept arguments and can return a value: this allows a function to be dynamic. For example, one might have a function that accepts two numbers and adds them together and returns the result. This might look like (in pseudo code):

 func add(integer a, integer b)
   return a + b;
 print add(1, 2);

This code passes 1 and 2 as the arguments a and b, adds them together, and returns the result (3); the return value is then printed.
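The same function in Python, as a small sketch:

```python
def add(a, b):
    """Return the sum of the two arguments."""
    return a + b


# The function can be reused with different arguments.
print(add(1, 2))    # 3
print(add(10, 32))  # 42
```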

Recursion

A recursive function calls itself, similarly to a loop, until a terminating condition (the base case) is reached, at which point it returns. In the following example, the recursive function computes the factorial of n: each call multiplies the running total by n and reduces n by one, and the function returns the total once n reaches one.

 func factorial(integer n, integer total = 1)
   if (n <= 1)
     return total
   return factorial(n - 1, total * n)
 print factorial(10);
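A Python sketch of a recursive factorial, written in the simplest form where each call multiplies n by the factorial of n - 1. It assumes n is a non-negative integer.

```python
def factorial(n):
    """Return n! for a non-negative integer n."""
    if n <= 1:
        # Base case: stops the recursion.
        return 1
    # Recursive case: the function calls itself with a smaller n.
    return n * factorial(n - 1)


print(factorial(5))   # 120
print(factorial(10))  # 3628800
```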

Classes and objects

See also: Objects in C++, PHP, and Python

Object oriented programming allows a programmer to create objects to simplify programming; for example, a programmer might create a "cat" class and a "dog" class. Classes contain members such as functions (known as methods when used in classes) and variables.

 class cat {
   integer size
   integer hair_color
   function purr
     // code to purr
 }
 class dog {
   integer size
   integer hair_color
   function growl
     // code to growl
 }

States

Static utility classes

Instances

Our cats and dogs each have a size and hair color variable and a purr / growl function, respectively. An object of a class can then be instantiated and used in this fashion:

 cat kitty
 kitty.size = 3
 kitty.purr()
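For comparison, a cat class and one instance of it might look like this in Python. The string hair color and the return value of purr() are choices made for the example, not part of the pseudocode.

```python
class Cat:
    def __init__(self, size, hair_color):
        # Each instance keeps its own copy of these variables.
        self.size = size
        self.hair_color = hair_color

    def purr(self):
        return "purr"


kitty = Cat(size=1, hair_color="black")
kitty.size = 3        # variables are reached through the instance
print(kitty.size)     # 3
print(kitty.purr())   # purr
```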


Extension

Objects also allow for inheritance: a class can "inherit" all of the functions and variables of another class. Our cat and dog classes could therefore be rewritten as an animal class holding the shared variables, plus cat and dog classes that inherit from it. This allows us to prevent redundant code.

 class animal {
   public:
     integer size
     integer hair_color
 }
 class cat inherits animal {
   function purr()
     // code to purr
 }
 class dog inherits animal {
   function growl()
     // code to growl
 }
 cat kitty
 kitty.size = 3
 dog puppy
 puppy.size = 3
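A Python sketch of the same hierarchy: the shared variables live in Animal, and Cat and Dog add only what is specific to them. The hair colors and method return values are invented for the example.

```python
class Animal:
    def __init__(self, size, hair_color):
        self.size = size
        self.hair_color = hair_color


class Cat(Animal):
    def purr(self):
        return "purr"


class Dog(Animal):
    def growl(self):
        return "grrr"


kitty = Cat(3, "black")
puppy = Dog(3, "brown")
print(kitty.size, puppy.size)     # 3 3 (inherited from Animal)
print(isinstance(kitty, Animal))  # True
print(hasattr(kitty, "growl"))    # False: growl belongs to Dog only
```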

Abstraction

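A class is abstract when it cannot be instantiated directly and exists only to be extended by child classes, which supply the functionality it leaves out. (A class that can no longer be extended is usually called final or sealed instead.)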
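As one illustration, Python's abc module can mark a class as abstract so that it cannot be instantiated directly and child classes must supply the missing method. The class and method names below are invented for the sketch.

```python
from abc import ABC, abstractmethod


class Animal(ABC):
    @abstractmethod
    def speak(self):
        """Child classes must provide this method."""


class Cat(Animal):
    def speak(self):
        return "purr"


try:
    Animal()  # the abstract class itself cannot be instantiated
except TypeError as error:
    print("Cannot create an Animal:", error)

print(Cat().speak())  # purr
```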


Scope

Objects also allow for access permissions. Standard convention suggests that variables should be private (meaning only the class's own functions can access them) and functions should be public (anything can access them). A programmer can also create a constructor and a destructor for the class (functions called when an object of the class is created or destroyed, respectively). Example:

public

private

protected

 class cat {
   private:
     integer size
     integer hair_color
   public:
     cat() { // constructor
       size = 3
     }
     ~cat() { // destructor
       // release any resources the object holds
     }
     function purr()
       // code to purr
 }

Creation, destruction, and validity

Accessors

Accessors are functions used to access the private variables of a class. This allows a programmer to change the internals of the class without affecting how the class is used.

 class cat {
   private:
     integer size
   public:
     function set_size(new_size) {
       size = new_size
     }
     function get_size() {
       return size
     }
 }
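A Python sketch of the same accessors follows. Python has no enforced private keyword, so the leading underscore marks the variable as private by convention only. The validation inside set_size is an addition for illustration: it shows how the class can change its internal rules without callers changing how they use it.

```python
class Cat:
    def __init__(self, size):
        self._size = size  # leading underscore: private by convention

    def get_size(self):
        return self._size

    def set_size(self, new_size):
        # Callers go through the accessor, so checks can be added here later.
        if new_size <= 0:
            raise ValueError("size must be positive")
        self._size = new_size


kitty = Cat(3)
kitty.set_size(4)
print(kitty.get_size())  # 4
```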

Constructors

Destructors

Referencing

See also: Referencing in Perl

Referencing allows data to be passed by reference, instead of by value. This can make programs much faster, as data does not have to be copied for every function call. In many high-level languages, this happens implicitly for larger values such as arrays and objects. However, lower level languages, such as C, generally require a reference to be passed explicitly as a pointer. A pointer in C is declared with an asterisk (*), the address of a variable is taken with an ampersand (&), and the pointer can be "dereferenced" (accessing the value at the address) with another asterisk. For example:

 int value = 3;
 int *ptr = &value;
 printf("%d\n", *ptr);
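Python has no pointer syntax, but it passes references to objects implicitly, which can be observed by letting a function modify a list it was given. The function names below are made up for this sketch.

```python
def append_in_place(items):
    items.append(4)  # changes the caller's list: no copy was made


def append_to_copy(items):
    items = list(items)  # explicit copy; the caller's list is untouched
    items.append(4)
    return items


values = [1, 2, 3]
append_in_place(values)
print(values)          # [1, 2, 3, 4]

copied = append_to_copy(values)
print(values)          # [1, 2, 3, 4]
print(copied)          # [1, 2, 3, 4, 4]
print(copied is values)  # False: two separate objects
```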

Anonymous References

Many languages have the ability to call a function through a name stored in a string rather than by writing the function's identifier directly, for example "function_name"() vs function_name(). This allows a programmer to choose which function to call while the program is running. These are also called anonymous function calls.
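In Python, one way to call a function chosen by name at run time is to look the name up with getattr() or in a dictionary. The sketch below uses the standard math module and a hypothetical handlers table.

```python
import math

# Look up a function in a module from a name held in a string.
function_name = "sqrt"
function = getattr(math, function_name)
print(function(16))  # 4.0

# A dictionary of names to functions works for one's own code.
handlers = {
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}
choice = "square"
print(handlers[choice](5))  # 25
```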

Functionally anonymous collection handling