Build GeoIPISP legacy database #44

Open
malakudi opened this issue Oct 5, 2024 · 5 comments

malakudi commented Oct 5, 2024

I am trying to build a GeoIPISP legacy database. I copied the code from the ASN-related classes (ASNRadixTree etc.), but the resulting file is not valid. The input CSV field is called isp and it is parsed correctly, yet the output file is wrong. What am I missing? Can you help?

Here is my patch so far

--- ../100/geolite2legacy/pygeoip_const.py	2024-10-05 10:05:47.870249802 +0300
+++ ../geolite2legacy/pygeoip_const.py	2024-10-05 10:22:07.172732546 +0300
@@ -405,6 +405,7 @@
 CITY_EDITION_REV1_V6 = 30
 ORG_EDITION = 5
 ISP_EDITION = 4
+ISP_EDITION_V6 = 22
 ASNUM_EDITION = 9
 ASNUM_EDITION_V6 = 21
 NETSPEED_EDITION = 10
@@ -416,7 +417,7 @@
 
 
 # Collection of databases
-IPV6_EDITIONS = (COUNTRY_EDITION_V6, ASNUM_EDITION_V6, CITY_EDITION_REV1_V6)
+IPV6_EDITIONS = (COUNTRY_EDITION_V6, ASNUM_EDITION_V6, ISP_EDITION_V6, CITY_EDITION_REV1_V6)
 CITY_EDITIONS = (CITY_EDITION_REV0, CITY_EDITION_REV1, CITY_EDITION_REV1_V6)
 REGION_EDITIONS = (REGION_EDITION_REV0, REGION_EDITION_REV1)
 REGION_CITY_EDITIONS = REGION_EDITIONS + CITY_EDITIONS
--- ../100/geolite2legacy/geolite2legacy.py	2024-10-05 10:05:47.854249430 +0300
+++ ../geolite2legacy/geolite2legacy.py	2024-10-05 11:01:57.014387404 +0300
@@ -241,6 +241,29 @@
     reclen = STANDARD_RECORD_LENGTH
     segreclen = SEGMENT_RECORD_LENGTH
 
+class ISPRadixTree(RadixTree):
+    seek_depth = 31
+    edition = ISP_EDITION
+    reclen = STANDARD_RECORD_LENGTH
+    segreclen = SEGMENT_RECORD_LENGTH
+
+    def gen_nets(self, locations, infile):
+        for row in csv.DictReader(infile):
+            nets = [IPNetwork(row['network'])]
+            isp = decode_text(row['isp'])
+            entry = u'{}'.format(isp)
+            yield nets, (serialize_text(entry),)
+
+    def encode(self, data):
+        return data + b'\0\0\0'
+
+
+class ISPv6RadixTree(ISPRadixTree):
+    seek_depth = 127
+    edition = ISP_EDITION_V6
+    reclen = STANDARD_RECORD_LENGTH
+    segreclen = SEGMENT_RECORD_LENGTH
+
 
 class CityRev1RadixTree(RadixTree):
     seek_depth = 31
@@ -366,13 +389,15 @@
 RTree = {
     'Country': {'IPv4': CountryRadixTree, 'IPv6': Countryv6RadixTree},
     'City': {'IPv4': CityRev1RadixTree, 'IPv6': CityRev1v6RadixTree},
-    'ASN': {'IPv4': ASNRadixTree, 'IPv6': ASNv6RadixTree}
+    'ASN': {'IPv4': ASNRadixTree, 'IPv6': ASNv6RadixTree},
+    'ISP': {'IPv4': ISPRadixTree, 'IPv6': ISPv6RadixTree}
 }
 
 Filenames = {
     'Country': {'IPv4': "GeoIP.dat", 'IPv6': "GeoIPv6.dat"},
     'City': {'IPv4': "GeoIPCity.dat", 'IPv6': "GeoIPCityv6.dat"},
-    'ASN': {'IPv4': "GeoIPASNum.dat", 'IPv6': "GeoIPASNumv6.dat"}
+    'ASN': {'IPv4': "GeoIPASNum.dat", 'IPv6': "GeoIPASNumv6.dat"},
+    'ISP': {'IPv4': "GeoIPISP.dat", 'IPv6': "GeoIPISPv6.dat"},
 }
 
 
@@ -423,8 +448,9 @@
     # noinspection PyUnboundLocalVariable
     datfilecomment = '{} converted to legacy MaxMind DB with geolite2legacy'.format(os.path.dirname(entry.filename))
     dbtype, entries = entries.popitem()
+    print(dbtype)
 
-    if dbtype == 'ASN':
+    if dbtype == 'ISP' or dbtype == 'ASN':
         locs = None
     else:
         if not {'Locations', 'Blocks'} <= set(entries.keys()):

hege-li commented Oct 5, 2024

@malakudi At least you need to change STANDARD_RECORD_LENGTH -> ORG_RECORD_LENGTH (edit: sorry, I meant ORG).

You may also need this (I am not sure whether the \0\0\0 -> \0 change is required for ISP/ORG, but try it):

    def encode(self, data):
        return data + b'\0'

malakudi commented Oct 5, 2024

> @malakudi Atleast you need to change STANDARD_RECORD_LENGTH -> GEO_RECORD_LENGTH
>
> And possibly this (not sure if \0\0\0 -> \0 change was required for isp/org but try it) def encode(self, data): return data + b'\0'

GEO_RECORD_LENGTH is not defined in GeoIP.h, nor in pygeoip_const.py. Maybe you mean ORG_RECORD_LENGTH?

malakudi commented Oct 5, 2024

It worked with reclen = ORG_RECORD_LENGTH
Thank you very much!
I will create a pull request later
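
A minimal sketch of why the record length matters, for anyone hitting the same problem. It assumes the values 3 and 4 for STANDARD_RECORD_LENGTH and ORG_RECORD_LENGTH (as I believe GeoIP.h defines them) and little-endian node pointers; pack_node and read_node are hypothetical helpers written for this illustration, not functions from geolite2legacy. The idea is that a reader expecting 4-byte records misreads a tree that was written with 3-byte records.

```python
# Assumed values, believed to match GeoIP.h / pygeoip_const.py.
STANDARD_RECORD_LENGTH = 3
ORG_RECORD_LENGTH = 4


def pack_node(left, right, reclen):
    """Encode one tree node as two little-endian pointers of reclen bytes each."""
    return left.to_bytes(reclen, "little") + right.to_bytes(reclen, "little")


def read_node(buf, index, reclen):
    """Decode node number `index`, assuming every pointer is reclen bytes wide."""
    offset = index * 2 * reclen
    left = int.from_bytes(buf[offset:offset + reclen], "little")
    right = int.from_bytes(buf[offset + reclen:offset + 2 * reclen], "little")
    return left, right


# A tiny three-node tree (pointer values are arbitrary for the demo).
nodes = [(1, 2), (500, 501), (502, 503)]

written_3 = b"".join(pack_node(l, r, STANDARD_RECORD_LENGTH) for l, r in nodes)
written_4 = b"".join(pack_node(l, r, ORG_RECORD_LENGTH) for l, r in nodes)

# A reader that expects 4-byte records only recovers the pointers
# when the writer also used 4-byte records.
mismatched = read_node(written_3, 0, ORG_RECORD_LENGTH)
matched = read_node(written_4, 0, ORG_RECORD_LENGTH)

print("written with 3, read with 4:", mismatched)
print("written with 4, read with 4:", matched)
```

With mismatched widths the very first node already decodes to pointers that were never written, which is consistent with the lookup tool rejecting the file until reclen was changed.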

malakudi commented Oct 5, 2024

> @malakudi Atleast you need to change STANDARD_RECORD_LENGTH -> ORG_RECORD_LENGTH (edit: sorry meant ORG)
>
> And possibly this (not sure if \0\0\0 -> \0 change was required for isp/org but try it) def encode(self, data): return data + b'\0'

The database seems correct with both \0\0\0 and \0, at least when checked with the geoiplookup utility. The same holds for the ASNum db. Why was \0\0\0 needed?

malakudi commented Oct 5, 2024

OK, to answer my own question: when you do not use \0\0\0, the version string (shown by geoiplookup with -v) is empty.
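
One plausible explanation, sketched under an assumption that is not confirmed in this thread: the reader finds the description string by scanning backward from the trailing 0xFF 0xFF 0xFF marker until it meets three consecutive NUL bytes. The function find_db_info and both sample byte strings below are hypothetical and only illustrate that idea; they are not taken from libGeoIP or geolite2legacy.

```python
def find_db_info(blob, max_info=100):
    """Return the text between a 3-NUL delimiter and the 0xFF*3 marker.

    Assumed layout: ...data... \\0\\0\\0 <info string> \\xff\\xff\\xff <type> <segment>
    Returns None when no delimiter is found within max_info bytes.
    """
    marker = blob.rfind(b"\xff\xff\xff")
    if marker < 0:
        return None
    # Step backward one byte at a time, looking for the NUL delimiter.
    for i in range(max_info):
        start = marker - i - 3
        if start < 0:
            break
        if blob[start:start + 3] == b"\x00\x00\x00":
            return blob[start + 3:marker].decode("latin-1")
    return None


# Trailer shared by both samples: marker, edition byte, 3-byte segment count.
trailer = b"\xff\xff\xff" + b"\x04" + b"\x01\x00\x00"

# Last data record terminated with three NULs: the delimiter is present.
with_three_nuls = b"Example ISP\x00\x00\x00" + b"demo comment" + trailer

# Last data record terminated with a single NUL: no delimiter to find.
with_one_nul = b"Example ISP\x00" + b"demo comment" + trailer

print(find_db_info(with_three_nuls))
print(find_db_info(with_one_nul))
```

Under this assumption the record lookups still work with a single \0, because each name is simply read up to its first NUL, while the description lookup loses its delimiter.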
