Install GeoLite2 into your mySQL Database

Many times in determining the usefulness of some proxy server, you need to know where that proxy server is located at. There are plenty of websites online that will show you the precise geographic location of any IP address, but very few will let you do any type of mass-querying for large number of IP addresses. So to do that, we need to get ahold of some "ip-to-location" database and do all that querying from our own computer/server. One of the most popular products available that provides such functionality is the GeoIP2 by MaxMind.

The database we're interested is their GeoLite2 City in its CSV Format that can be downloaded from here.

wget http://geolite.maxmind.com/download/geoip/database/GeoLite2-City-CSV.zip
unzip GeoLite2-City-CSV.zip

Download it and unzip it and you should get a directory named GeoLite2-City-CSV_20160503 or something similar to that with a bunch of CSV files in it. Next step is preparing to import all that data into our database.

We first create two tables inside our database: "ip_blocks" and "ip_locations". List of all IP address data columns and their datatypes is described at GeoIP2 City and Country CSV Databases page on their website. All the code below is based off of that data structure at the time of this writing which might have changed since then.

CREATE TABLE `ip_blocks` (

`ip_from` int unsigned NOT NULL,
`ip_to` int unsigned NOT NULL,

  `network` varchar(32) NOT NULL,
  `geoname_id` int unsigned NOT NULL,
  `registered_country_geoname_id` int unsigned NOT NULL,
  `represented_country_geoname_id` int unsigned NOT NULL,
  `is_anonymous_proxy` tinyint(1) NOT NULL,
  `is_satellite_provider` tinyint(1) NOT NULL,
  `postal_code` varchar(32) NOT NULL,
  `latitude` float(8,4) NOT NULL,
  `longitude` float(8,4) NOT NULL,
  `accuracy_radius` smallint unsigned NOT NULL
);
and
CREATE TABLE `ip_locations` (
  `geoname_id` int unsigned NOT NULL,
  `locale_code` varchar(32) NOT NULL,
  `continent_code` char(2) NOT NULL,
  `continent_name` varchar(32) NOT NULL,
  `country_iso_code` char(2) NOT NULL,
  `country_name` varchar(64) NOT NULL,
  `subdivision_1_iso_code` varchar(3) NOT NULL,
  `subdivision_1_name` varchar(128) COLLATE 'utf8_unicode_ci' NOT NULL,
  `subdivision_2_iso_code` varchar(3) NOT NULL,
  `subdivision_2_name` varchar(128) COLLATE 'utf8_unicode_ci' NOT NULL,
  `city_name` varchar(128) COLLATE 'utf8_unicode_ci' NOT NULL,
  `metro_code` smallint unsigned NOT NULL,
  `time_zone` varchar(64) NOT NULL
);

The next step is to actually import data into those two tables. The preferred way of doing that is through a terminal because importing millions of rows could take a while.

LOAD DATA LOCAL INFILE '/root/geoip/GeoLite2-City-CSV_20160503/GeoLite2-City-Blocks-IPv4.csv' INTO TABLE ip_blocks COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' IGNORE 1 LINES (
@network,
geoname_id,
registered_country_geoname_id,
represented_country_geoname_id,
is_anonymous_proxy,
is_satellite_provider,
postal_code,
latitude,
longitude,
accuracy_radius) SET 
ip_from = INET_ATON(SUBSTRING(@network, 1, LOCATE('/', @network) - 1)),
ip_to = (INET_ATON(SUBSTRING(@network, 1, LOCATE('/', @network) - 1)) + (pow(2, (32-CONVERT(SUBSTRING(@network, LOCATE('/', @network) + 1), UNSIGNED INTEGER)))-1));

If you get the "The used command is not allowed with this MySQL version" error, you need to be running mySQL with the --local-infile option:

mysql --local-infile -u root -p your_database_name

Now we load the data into ip_locations table. Again, the query code below assumes that you want all columns to be included in the table. If you do not wish to store a particular field in your database, then add @ before each of those columns you don't want.

LOAD DATA LOCAL INFILE '/root/geoip/GeoLite2-City-CSV_20160503/GeoLite2-City-Locations-en.csv' INTO TABLE ip_locations CHARACTER SET UTF8 COLUMNS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' IGNORE 1 LINES (
geoname_id,
locale_code,
continent_code,
continent_name,
country_iso_code,
country_name,
subdivision_1_iso_code,
subdivision_1_name,
subdivision_2_iso_code,
subdivision_2_name,
city_name,
metro_code,
time_zone);

If all went successful, both tables should now be populated with the data from those CSV files. Next step is adding indexes to make querying a lot faster.

ALTER TABLE `ip_blocks` ADD PRIMARY KEY `ip_to` (`ip_to`);
ALTER TABLE `ip_locations` ADD PRIMARY KEY `geoname_id` (`geoname_id`);

And finally, we add this helper procedure to simplify our future queries.

DELIMITER $$

CREATE FUNCTION `IP2Location`(`ip` varchar(50))
	RETURNS int(11)
	LANGUAGE SQL
	DETERMINISTIC
	CONTAINS SQL
	SQL SECURITY DEFINER
	COMMENT ''
BEGIN

DECLARE loc_id INT;

SELECT geoname_id INTO loc_id FROM ip_blocks WHERE ip_to >= INET_ATON(TRIM(ip)) ORDER BY ip_to LIMIT 1;

RETURN IFNULL(loc_id, 0);

END $$

Usage:

SELECT * FROM ip_locations WHERE geoname_id = IP2Location("198.199.81.169");