First I would check if you have a sufficient amount of RAM available. If not, this could explain the slow performance.
If this is the case, I would recommend to read, process and write the data in batches.
Otherwise, there are a lot of parameters that can impact the performance. E.g. how complex the geometries are, how many rows you want to write, how many parallel reads and write you have to the disk, etc.
Regarding geometries problems, I'm not entirely sure what you mean. But regardless, with big datasets, it's always a good option to debug with smaller datasets (e.g. the first thousand lines) and then test if everything works.
And fully unrelated, I would recommend to use os.path.join(datadir, "data.shp") instead of data_dir+"data.gpkg"