Unsafe substring indexing

From NetSec

String positioning functions are often inadvertently misused in a wide variety of languages when testing whether a pattern exists within a string.

The vulnerability

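The problem is that Perl's index(), PHP's strpos(), and Python's find() return 0 when the search character or string begins at position 0 of the haystack (positions are counted from 0), and 0 evaluates to false in those languages. A condition that tests the return value directly is therefore not met even though the needle exists in the haystack. The "not found" value varies: strpos() returns false, while Perl's index() and Python's find() return -1, which evaluates to true, so a bare truth test in those two languages is wrong in both directions. Ruby's index() is the exception: it returns nil when there is no match and 0 is truthy in Ruby, so a bare test works there, although a numeric comparison against the result fails on nil. These functions are often used to determine whether a string exists within another string (rather than regular expressions or other string comparison operators) because of the performance benefits. The downside is that careless implementations result in vulnerabilities.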

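  if (strpos($var, '/')) ...    // PHP: false when '/' is at index 0
  if var.find('/'): ...         # Python: false at index 0, true (-1) when absent
  if (var.index('/')) ...       # Ruby: nil when absent; 0 is truthy, so this form is safe
  if (index($var,'/')) ...      # Perl: false at index 0, true (-1) when absent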
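
The Python case can be reproduced in a few lines. This is only an illustrative sketch: naive_contains is a hypothetical helper written to show the bug, not code taken from any real project.

```python
def naive_contains(haystack, needle):
    # BUG (intentional): the index returned by find() is used as a boolean.
    return bool(haystack.find(needle))

print("/etc/passwd".find("/"))             # 0  (match at the first character)
print("etc".find("/"))                     # -1 (no match)

print(naive_contains("/etc/passwd", "/"))  # False, although '/' is present
print(naive_contains("etc", "/"))          # True, although '/' is absent
print(naive_contains("a/b", "/"))          # True, correct only by luck
```

A filter built on such a test lets through exactly the inputs that start with the forbidden character.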


Because the problem is so simple, adding the missing boundary check is trivial. The exact check is language specific, but each form is easy to remember. In PHP, the strict comparison operator '!==' (with its extra '=') compares type as well as value, which ensures that an integer zero returned by strpos() is not confused with the boolean false that signals no match.

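  if (strpos($var, "/") !== false) ...     // PHP: strict check against false
  if var.find("/") >= 0: ...               # Python: -1 means not found
  if (!var.index('/').nil?) ...            # Ruby: index returns nil, not -1
  if (index($var,'/') >= 0) ...            # Perl: -1 means not found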
Note: Perl, Ruby, and Python all spell the numeric "greater-than or equals" comparison '>='. Perl's 'ge' operator compares strings and should not be used on the integer returned by index().
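
In Python the index can be avoided altogether with the membership operator. The following is a minimal sketch under that assumption; contains, contains_via_find, and reject_path_separators are hypothetical names chosen for illustration.

```python
def contains(haystack, needle):
    # Membership test: yields a real boolean, no index involved.
    return needle in haystack

def contains_via_find(haystack, needle):
    # Equivalent check that keeps find(): compare against the documented
    # "not found" value instead of relying on truthiness.
    return haystack.find(needle) != -1

def reject_path_separators(name):
    # Hypothetical validator: refuse any name that contains '/'.
    if contains(name, "/"):
        raise ValueError("path separator not allowed: %r" % name)
    return name
```

Both helpers agree for a needle at position 0, in the middle of the string, or absent, which is the property the bare truth test lacks.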

Protip: Some languages will require you to properly enforce type (scalar vs. array) and may behave unpredictably if an array is passed (e.g. var[1]=foo&var[2]=bar in the URL).
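
A rough Python analogue of the array problem, assuming the query string is parsed with the standard library's urllib.parse.parse_qs, which always returns lists. The helper scalar_param and the parameter name var are hypothetical.

```python
from urllib.parse import parse_qs

def scalar_param(query, name):
    # Accept the parameter only if it was supplied exactly once.
    values = parse_qs(query).get(name, [])
    if len(values) != 1:
        raise ValueError("expected exactly one value for %r" % name)
    return values[0]

# Repeating the key produces a list, and a substring test applied to the
# list silently checks for an element equal to "/" instead.
params = parse_qs("var=a%2Fb&var=c")
print(params["var"])         # ['a/b', 'c']
print("/" in params["var"])  # False, although 'a/b' contains '/'
```

Forcing the value down to a single string before any substring check keeps the check meaningful.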


Using a simple bash command, one can search for all of the occurrences (with context) of these functions:


localhost:~ $ find -name \*.py -exec grep -HnC2 \\.find\( '{}' \; \
    -o -name \*.php -exec grep -HnC2 str.\?pos\( '{}' \; \
    -o -regextype posix-awk -regex ".*\.(rb|pl|pm)" \
    -exec grep -HnC2 \\bindex\( '{}' \; &> string_locating.txt

It is not hard to isolate vulnerable implementations from there:


localhost:~ $ grep '\.php:[0-9]\+:' string_locating.txt | grep -v '!==\|==='
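
For Python sources, the same triage can be approximated with the standard library's ast module instead of grep. This is a sketch only: bare_find_tests is a hypothetical helper, and it flags just the simplest pattern, a .find() call used directly (or under not) as the condition of an if, while, or conditional expression.

```python
import ast

def bare_find_tests(source):
    """Return line numbers where a .find() call is used directly as a condition."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.If, ast.While, ast.IfExp)):
            test = node.test
            # A leading 'not' is just as wrong, so look through it.
            if isinstance(test, ast.UnaryOp) and isinstance(test.op, ast.Not):
                test = test.operand
            if (isinstance(test, ast.Call)
                    and isinstance(test.func, ast.Attribute)
                    and test.func.attr == "find"):
                hits.append(test.lineno)
    return sorted(hits)

sample = (
    "if var.find('/'):\n"
    "    pass\n"
    "if var.find('/') >= 0:\n"
    "    pass\n"
    "if not s.find('x'):\n"
    "    pass\n"
)
print(bare_find_tests(sample))  # [1, 5]
```

Conditions that combine the call with and/or, or that store the index in a variable first, are not covered and still need manual review.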