Handling file extensions and drivers


hannes-fiona.groups.io@...
 

For a Fiona-based command line tool I am wondering how best to handle user-selected output file formats and drivers.

Is there a ready-to-use way to select the driver based on the file extension? Maybe via a mapping of driver -> file extension?

E.g. if a user specified "out.geojson", the driver should be GeoJSON, if "out.shp" then use "ESRI Shapefile", etc.


Sean Gillies
 

Hi Hannes,

We don't have such a mapping in Fiona. GDAL/OGR has one tucked away inside the metadata of format drivers. See https://github.com/OSGeo/gdal/blob/master/gdal/ogr/ogrsf_frmts/geojson/ogrgeojsonseqdriver.cpp#L849 for example. It might make sense to expose GDAL's mapping in some way. A consideration: Fiona would only be able to map extensions to drivers for currently enabled drivers at runtime. One might scrape the GDAL code to make a mapping for *all* possible supported drivers, but then some of these wouldn't be available to users at runtime, depending on the GDAL distribution they've installed.

In case you hadn't seen, Python has a mimetypes module already. It's how I think things should work in a perfect world where GDAL understood mimetypes.

>>> import mimetypes
>>> mimetypes.types_map['.json']
'application/json'
>>> mimetypes.types_map['.tif']
'image/tiff'

but it doesn't have the geospatial formats because we "geo-web" people haven't bothered to register them.

>>> mimetypes.types_map['.shp']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '.shp'
>>> mimetypes.types_map['.gpkg']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '.gpkg'

GeoJSON *is* registered, but I think the mimetypes data needs an update.

>>> mimetypes.types_map['.geojson'] == 'application/geo+json'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: '.geojson'
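
As an aside on the lookups above: a minimal sketch of how a tool could fill the gap itself, assuming it is acceptable to register the missing types at startup with the standard library's `mimetypes.add_type`. The helper name `guess_media_type` is hypothetical, and the two media types are the ones I believe are registered for GeoJSON and GeoPackage; shapefiles have no registered type that I know of, so they are left out.

```python
import mimetypes

# Register geospatial types that the bundled mimetypes data may lack.
# Assumption: these are the media types registered for the two formats.
mimetypes.add_type("application/geo+json", ".geojson")
mimetypes.add_type("application/geopackage+sqlite3", ".gpkg")


def guess_media_type(path):
    """Return the media type guessed from the file extension, or None."""
    media_type, _encoding = mimetypes.guess_type(path)
    return media_type


print(guess_media_type("out.geojson"))
print(guess_media_type("out.gpkg"))
```

This only goes from extension to media type; getting from there to a GDAL driver name would still need a separate mapping.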



hannes-fiona.groups.io@...
 

Thanks!

I hadn't thought of mime types at all. That would be perfect, but as you say it is not suitable because all the formats would need to be registered there.

I ended up writing a small dictionary of mappings between some popular driver names and possible file extensions (as decided by me):

```
driver_extensions = {
    "ESRI Shapefile": [".shp"],
    "GeoJSON": [".json", ".geojson"],
    "GPKG": [".gpkg"],
    "GML": [".gml", ".xml"],
}
```

Filtering the list of supported drivers to just those known ones can now be done with a fancy dict comprehension like:

```
drivers = {
    driver: modes
    for driver, modes in fiona.supported_drivers.items()
    if driver_extensions.get(driver)
}
```

And if needed a flat list of the extensions is a double list comprehension like:

```
extensions = [
    ext
    for sublist in driver_extensions.values()
    for ext in sublist
]
```
-> `['.shp', '.json', '.geojson', '.gpkg', '.gml', '.xml']`
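
For the original question (output path in, driver name out), the dictionary can also be inverted. This is only a rough sketch, not anything Fiona provides: `extension_drivers` and `driver_for_path` are names made up for illustration, and the optional `available_drivers` argument is an assumption about how one might restrict the result to drivers usable at runtime.

```python
from pathlib import Path

driver_extensions = {
    "ESRI Shapefile": [".shp"],
    "GeoJSON": [".json", ".geojson"],
    "GPKG": [".gpkg"],
    "GML": [".gml", ".xml"],
}

# Invert the mapping: one entry per extension, pointing at its driver.
extension_drivers = {
    ext: driver
    for driver, exts in driver_extensions.items()
    for ext in exts
}


def driver_for_path(path, available_drivers=None):
    """Pick a driver name from the file extension of `path`.

    `available_drivers` is an optional collection of driver names the
    caller considers usable; drivers outside it are rejected.
    """
    ext = Path(path).suffix.lower()
    if ext not in extension_drivers:
        raise ValueError(f"no known driver for extension {ext!r}")
    driver = extension_drivers[ext]
    if available_drivers is not None and driver not in available_drivers:
        raise ValueError(f"driver {driver!r} is not available")
    return driver


print(driver_for_path("out.geojson"))
print(driver_for_path("OUT.SHP"))
```

With Fiona, `available_drivers` could be something like `{d for d, modes in fiona.supported_drivers.items() if "w" in modes}`, so that only drivers with write support are accepted.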