User:Hatter/getting started
So you're new to offensive security, and one day you want to call yourself a hacker. Understanding the building blocks of a system is the first step towards learning to control it. A solid basis in administration is needed in order to know how to use a machine. A solid basis in programming will help you understand what information gathering leads to successful exploitation and maintaining access. While countermeasures do get in the way, most can be evaded or bypassed with an intermediate knowledge of programming.
Administration
Administration can be broken into several categories, but for the purposes of this library it is divided into system administration and network administration.
Mastery of an operating system is essential. Most servers on the Internet are powered by Linux. While difficult, Gentoo Installation offers a head-first approach to learning Linux. That said, we crawl before we learn to walk. Mastery of the basics of file manipulation, diagnostic tools and the like in the Linux environment will make you much more efficient when using Linux to do anything - be it rooting a box or web exploitation - so it is advised that you check out the Bash book, which will familiarize you with the commands that are essential to using a Linux system efficiently.
Protecting yourself on the Internet is essential, although you have already taken the first step by using a Linux box - especially if you chose to use Gentoo. In order to protect yourself from malicious packets, see the article on Iptables for filtering incoming packets and dropping those that are potentially malicious in nature. It is also a good idea to check the Anonymity article for tips on how to keep your identity secret on the Internet - depending on just what you intend to do, these measures can be as simple or as complex as you desire.
Code
Programming is the next essential skill. While it is possible to perform exploitation on an application without any knowledge of the language it is written in, understanding of the language allows for a deeper understanding of the way the application you are exploiting handles input and processes data - if you understand what makes it work, you will understand what makes it stop working in the way you want it to. If you want to go "head-first", start with a lower level language like assembly and move on to compiled languages, finishing with interpreted languages. If you prefer the easier, or "feet-first" approach, interpreted languages are the best place to start, then work towards assembly - perhaps learning one of the compiled languages in between.
Assembly and machine code are the building blocks of all other programming languages. Machine code is what most people think of when they refer to "binary code" (though it is more often represented as hexadecimal opcodes), while assembly is a system of mnemonic words that makes machine code easier to work with - for example, "\xcd\x80" becomes "int $0x80".
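As a small illustration of how opcode bytes map to mnemonics, the sketch below decodes a handful of x86 instructions in Python. The `decode` function and its tiny opcode table are hypothetical helpers written for this example, not a real disassembler; it handles only the three encodings shown and rejects everything else.

```python
# Minimal sketch: translate a few x86 opcode bytes into AT&T-style mnemonics.
# The table is deliberately tiny; a real disassembler handles far more.

def decode(machine_code):
    """Return a list with one mnemonic string per instruction."""
    mnemonics = []
    i = 0
    while i < len(machine_code):
        opcode = machine_code[i]
        if opcode == 0xCD:
            # "int imm8": the byte after the opcode is the interrupt number.
            if i + 1 >= len(machine_code):
                raise ValueError("truncated int instruction")
            mnemonics.append("int $0x%02x" % machine_code[i + 1])
            i += 2
        elif opcode == 0x90:
            # Single-byte no-operation instruction.
            mnemonics.append("nop")
            i += 1
        elif opcode == 0xC3:
            # Single-byte near return.
            mnemonics.append("ret")
            i += 1
        else:
            raise ValueError("opcode 0x%02x is not covered by this sketch" % opcode)
    return mnemonics


if __name__ == "__main__":
    print(decode(b"\xcd\x80"))
```

Passing `b"\xcd\x80"` yields the single mnemonic `int $0x80`, matching the example in the text.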
These languages are the predecessors to the C language, a mid-level compiled language which became the cornerstone for nearly all of the modern interpreted languages, including PHP, Perl, Python, and Ruby. The Linux kernel is written primarily in C, with some assembly. It is advisable to become familiar with C and to learn at least enough assembly to understand how your C code compiles. Learning interpreted languages such as PHP and Perl can also be useful for their flexibility and power.
When starting out with programming, it is important to avoid the kinds of mistakes that lead to vulnerabilities in your code - such as unsafe string replacement and other design flaws. Not only does learning about these vulnerabilities prevent your own code from being exploited, but the better you understand the potential pitfalls of a language, the better you can exploit those same pitfalls.
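One common instance of unsafe string handling is building a database query by pasting user input into the query text. The sketch below, using Python's standard `sqlite3` module, contrasts that with a parameterized query. The table layout and the function names (`make_db`, `find_user_unsafe`, `find_user_safe`) are assumptions made up for this illustration.

```python
import sqlite3


def make_db():
    """Create a throwaway in-memory database with two example users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, note TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "first user"), ("bob", "second user")],
    )
    return conn


def find_user_unsafe(conn, name):
    # Flawed on purpose: the input becomes part of the SQL text itself,
    # so quote characters in `name` can change the meaning of the query.
    query = "SELECT name FROM users WHERE name = '%s'" % name
    return conn.execute(query).fetchall()


def find_user_safe(conn, name):
    # The ? placeholder keeps the input as data; it is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    crafted = "x' OR '1'='1"
    print(len(find_user_unsafe(conn, crafted)))  # every row matches
    print(len(find_user_safe(conn, crafted)))    # no row has that literal name
```

The same input that matches every row in the string-built query matches nothing in the parameterized one, which is why the second form is the habit worth forming early.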
The biggest priority for the aspiring hacker and beginning programmer is to just start programming. Practice makes perfect, and the sooner you start, the faster you will become a competent coder. Thankfully, it is an amazing time for people new to programming, as there are now many resources for learning how to code. A good place to check out is Codecademy, because it is very beginner friendly and will get you on the road to learning essential tools in your hacking career like JavaScript and Python.
Programming Style
A critical strategy to keep in mind is keeping your code organized and following the standards put forth by the coders who came before you. This is known as programming style, and it can spell the difference between a good hacker and an amazing hacker.
The vast majority of programming languages offer the ability to comment your code by using special syntax. It is important that you use comments to document your code. It is useful to ask yourself: "If I returned to this code in 20 years to improve it, would I be able to tell what it was doing?"
Variables are something that you will deal with constantly in your programming activities. They are names given to pieces of data and, as the name implies, their values can vary during the life of your program. Good programming style incorporates useful variable names. In other words, if you had a program that took input from a user and then changed the first letter of their name, it would be bad style to name that variable "apple" as opposed to "userInput". Something to strive for is code that is self-documenting: the organization and variable naming are so good that it requires few comments to explain its execution.
A major benefit of keeping good programming style is that the clear organization of your program will help you spot errors much faster than you could in a carelessly crafted program (the dreaded "spaghetti code"). Not only does it help in debugging, but it will also make it easier to update your code to patch any exploits/vulnerabilities. There are many other, finer points of style which are specific to the language you are learning (in Java it is customary to use camelCase to name variables, while in Python you will see variables written as name_of_var), so it is best to talk with the community in order to discover best practices.
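To make the naming advice concrete, here is a short Python sketch of the "change the first letter of a name" program written twice: once with arbitrary names, once with descriptive ones. Both functions are hypothetical examples, and the Python version uses snake_case (user_input) rather than userInput, following the language convention mentioned above.

```python
# Poor style: the names say nothing about what the data is or what happens to it.
def f(apple, b):
    return b + apple[1:]


# Better style: the names document the code, so few comments are needed.
def change_first_letter(user_input, new_letter):
    """Return user_input with its first character replaced by new_letter."""
    if not user_input:
        # Nothing to replace in an empty string.
        return user_input
    return new_letter + user_input[1:]


if __name__ == "__main__":
    print(change_first_letter("hatter", "M"))
```

Both functions do the same work on non-empty input, but only the second can be understood without reading its body.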
Information Gathering
Social Engineering
Social Engineering is a method of extracting information from targets by means of social/psychological manipulation for the purpose of gaining unauthorized access to desired targets later on. Techniques often rely on the follies of human nature such as our tendency to easily trust others and help our fellow man.
As an example scenario, suppose an individual wants to gain access to a targeted financial firm. Utilizing social engineering, the individual may wear a cast and use crutches, approaching the door just as an employee is walking out. It is reasonable to assume that any of us in the employee's position would kindly hold the door open for this person. The financial firm has just been compromised and is now vulnerable to any number of attacks. The door may have been secured with a sophisticated access card lock, but the lock is useless against this kind of social engineering technique. The weakest link in any security setup is its users, and this is the fact that social engineering exploits.
Exploitation
The best way to learn exploitation is with a solid basis in programming. The best place to start is usually with exploiting a programming language that you are familiar with. If you are familiar with interpreted languages, web exploitation is the best place to begin; whereas if you are familiar with compiled languages, binary exploitation is the best step for a beginner. It is also best to start with an environment already familiar to you.
Web exploitation
Most beginners find web exploitation to be the easiest topic to start with. This requires a strong understanding of the World Wide Web. Web applications are programmed using a series of interpreted languages. This nearly always involves some form of HTML and CSS, which were originally developed to describe a document and that document's stylesheet, respectively. Dynamic content is usually powered by a database, and usually involves SQL code. The programming languages used to render dynamic content are interpreted on the web server, while languages such as HTML, CSS, and JavaScript are interpreted and rendered by the client.
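The split between server-side and client-side interpretation can be sketched in a few lines of Python. In this hypothetical example, `render_greeting` stands in for the server-side code: it runs on the server and produces HTML text, which the client's browser then interprets. The use of `html.escape` is an assumption added here to show one place where the server decides whether visitor input reaches the client as markup or as plain text.

```python
import html


def render_greeting(visitor_name):
    """Server side: build the HTML document that will be sent to the client."""
    # Escaping converts characters such as < and > into entities, so the
    # browser displays the visitor's input as text instead of parsing it as markup.
    safe_name = html.escape(visitor_name)
    return "<html><body><p>Hello, " + safe_name + "!</p></body></html>"


if __name__ == "__main__":
    print(render_greeting("Alice"))
    print(render_greeting("<b>Alice</b>"))
```

Everything inside `render_greeting` is interpreted on the server; the string it returns is what the client receives and renders.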
Web exploitation can be used to execute remote commands, steal cookies, extract database information, bypass authentication, plant database-powered privilege escalation backdoors and more. The fact that exploitation of interpreted languages is easier than exploitation of compiled languages does not make it any less effective. This, in conjunction with the recent popularity of web applications, makes it the best place to begin. We've also developed a series of web exploitation tools to assist beginners in remedial tasks.
Binary exploitation
Exploitation of compiled languages used to be much easier than it is today. Due to countermeasures like DEP, ASLR, and IPS applications/devices, binary exploitation is becoming more and more difficult. To perform a filter bypass on a modern operating system, the shellcode or machine code used during a buffer overflow exploit must be crafted to bypass all of the restrictions in place. Beginners usually start learning to write shellcode with a fundamental knowledge of assembly. Once an understanding of assembly for the respective operating system is obtained, null-free shellcode is usually the first type of shellcode written by a beginner. It is even possible to write printable and polymorphic alphanumeric shellcode and ASCII shellcode for IDS evasion. The Bleeding Life project contains shellcode which utilizes return-oriented programming in order to bypass ASLR and DEP countermeasures for vulnerable software running on the Windows 7 operating system.
Network exploitation
Network exploitation requires a solid understanding of network administration and network protocols.
Maintaining access
There are several ways to maintain access, and I will attempt to cover a few of the basics here. The first is to use innocent-looking process names (init, [kthreadd], httpd, sshd, etcetera). The aim is to mislead administrators into overlooking the process.
Another is an LKM rootkit. On older kernels these can be an ideal way to maintain a backdoor into the system. An important thing to remember is that anti-rootkit technologies for Linux-based systems such as chkrootkit and rkhunter are merely shell scripts. These can be easily modified to remove the checks for whatever public rootkit you may choose, as well as to remove the checksum checks on the script itself. If it is a newer kernel, you may wish to look into something like the jynx rootkit, as this will compile on most servers.