Because I ran into this too, and found the `df.apply()` approach too slow, I switched to building a single `MultiPoint()` object with vectorised operations, then turning that one object into `Point()`s with `list()`.

Just to be explicit: the D, M, S coordinate notation is just that, a different way to note latitude and longitude coordinates, where D, M and S stand for degrees, (arc) minutes and (arc) seconds. Decimal notation is another, which combines the degrees value with the arc minutes and arc seconds into a single number; an arc minute is 1/60th of a degree and an arc second is 1/3600th of a degree, so you can do a little math to sum the values together (preserving the sign of the degrees). GeoPy wants to work with decimal values, so you do need to fold the arc-minutes and arc-seconds into the degrees value.

NAD83 and NAD27, on the other hand, are not coordinate notations; they are geodetic datums, and such systems are notation agnostic. They are simply a standardised way of specifying what coordinate system to use and what point of reference that coordinate system is anchored to. That said, geopandas can be used to transform between different geodetic datums. The project accepts CRS strings to define what coordinate system to use when interpreting points (the geodetic datum is a component of such a system); looking up the EPSG codes for NAD83 and NAD27 in a coordinate system database gives us EPSG:4269 and EPSG:4267, respectively. Note that you don't have to create a dataframe here; a GeoSeries is enough if all you want is the conversion.

So, given that you have degrees, minutes and seconds, you need to convert those values to decimal coordinates to feed to geopandas, and you'd want to do this fast and efficiently. You can do this with vectorised calculations, where numpy applies the calculations to all rows using very fast arithmetic operations directly on the machine representation of the data, not on the Python representations.

Moreover, we need to account for the fact that the DMS columns may only have included the `-` sign on the D column. If that's the case and you are lucky, the DataFrame was created using numpy floats and `-0.0` has been stored as `numpy.NZERO` (negative zero), in which case we can still recover the sign using `numpy.signbit()`. If not, the sign is lost and points just south of the equator or just west of the zeroth meridian will appear as just north or east instead.

I'm sticking to the same convention here: the input is a Pandas DataFrame `df` that contains the columns `lonD`, `lonM`, `lonS`, `latD`, `latM` and `latS`. Using geopandas, numpy and shapely:

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import asMultiPoint  # note: the as* adapters were removed in Shapely 2.0

def dms_to_dec(d, m, s):
    """convert d, m, s coordinates to decimals

    Can be used as a vectorised operation on whole numpy arrays.
    Handles signs only present on the D column, transparently.

    Note that for -0d Mm Ss inputs, the sign might have been lost!
    However, if it was preserved as np.NZERO, this function will
    still recover it, because negative zero and positive zero are
    distinguished correctly.
    """
    sign = np.where(np.signbit(d), np.ones_like(d) * -1.0, np.ones_like(d))
    return d + sign * m / 60 + sign * s / 3600

def martijnpieters_vectorised(df):
    # generate the column names, grouped by component:
    # ['lonD', 'latD'], ['lonM', 'latM'], ['lonS', 'latS']
    comps = ([f"{c}{a}" for c in ("lon", "lat")] for a in "DMS")
    # one MultiPoint from the vectorised conversion of all column groups
    mpoints = asMultiPoint(dms_to_dec(*(df[c].values for c in comps)))
    # split into individual Points with list()
    return gpd.GeoSeries(list(mpoints))
```

At which point you can test how fast either one is with IPython's `%timeit` or another benchmarking library:

```python
df100 = random_world_coords(100)
%timeit martijnpieters_vectorised(df100)
```

That's 100 items, and vectorising is about 4.5 times faster. If you increase the count to 1000, the difference becomes much more apparent:

```python
df1000 = random_world_coords(1000)
%timeit martijnpieters_vectorised(df1000)
```

So at 1000 rows, vectorising still takes mere milliseconds, and in fact takes less time than before because we are now hitting optimisations numpy uses for larger datasets, while the time taken to run `df.apply()` on those same 1000 rows has ballooned to over 4 seconds.

(Note: I also ran the tests with a deepcopy of the input for each test, to make sure that I wasn't gaining an advantage from already-processed data, but the timings still went down, not up, for the 100 -> 1000 rows case.)
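To make the degrees/minutes/seconds arithmetic concrete, here is a plain-Python sketch of the same sum for a single positive coordinate; the 12° 30′ 36″ value is just an illustration, not from the original data:

```python
# 12 degrees, 30 arc minutes, 36 arc seconds, folded into one decimal value:
# an arc minute is 1/60 of a degree, an arc second is 1/3600 of a degree
d, m, s = 12.0, 30.0, 36.0
decimal = d + m / 60 + s / 3600   # 12 + 0.5 + 0.01
print(decimal)  # 12.51
```

For negative coordinates the minutes and seconds must be subtracted rather than added, which is exactly why the vectorised function computes an explicit `sign` factor.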
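The negative-zero detail is easy to verify in isolation. This small numpy snippet (illustrative only; it uses a simplified form of the `sign` expression) shows why `np.signbit()` recovers the sign where an ordinary comparison cannot:

```python
import numpy as np

d = np.array([-0.0, -12.0, 0.0, 12.0])

# -0.0 compares equal to 0.0, so == cannot tell them apart
print(-0.0 == 0.0)    # True

# np.signbit() inspects the IEEE 754 sign bit directly
print(np.signbit(d))  # [ True  True False False]

sign = np.where(np.signbit(d), -1.0, 1.0)
print(sign)           # [-1. -1.  1.  1.]
```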
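The benchmarks above call a `random_world_coords()` helper that isn't shown here. The sketch below is a hypothetical stand-in for it, together with a `df.apply()` baseline and a shapely-free vectorised version, so the comparison can be reproduced outside IPython with the standard `timeit` module; all names and the random-data layout are assumptions, not the original benchmark code:

```python
import timeit

import numpy as np
import pandas as pd

def random_world_coords(n, seed=0):
    # hypothetical helper: n rows of random coordinates split into D/M/S columns
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "lonD": rng.integers(-179, 180, n).astype(float),
        "lonM": rng.integers(0, 60, n).astype(float),
        "lonS": rng.uniform(0, 60, n),
        "latD": rng.integers(-89, 90, n).astype(float),
        "latM": rng.integers(0, 60, n).astype(float),
        "latS": rng.uniform(0, 60, n),
    })

def dms_to_dec(d, m, s):
    sign = np.where(np.signbit(d), -1.0, 1.0)
    return d + sign * m / 60 + sign * s / 3600

def apply_based(df):
    # the row-at-a-time approach the answer argues against
    return df.apply(
        lambda r: pd.Series({
            "lon": dms_to_dec(r.lonD, r.lonM, r.lonS),
            "lat": dms_to_dec(r.latD, r.latM, r.latS),
        }),
        axis=1,
    )

def vectorised(df):
    # whole-column conversion; returns an (n, 2) array of lon/lat pairs
    comps = ([f"{c}{a}" for c in ("lon", "lat")] for a in "DMS")
    return dms_to_dec(*(df[c].values for c in comps))

df100 = random_world_coords(100)
print("apply:     ", timeit.timeit(lambda: apply_based(df100), number=10))
print("vectorised:", timeit.timeit(lambda: vectorised(df100), number=10))
```

Both functions produce the same numbers, so the timing difference is purely the per-row Python overhead of `df.apply()`.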
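Finally, a minimal sketch of the datum conversion those EPSG codes enable, assuming geopandas and pyproj are installed; the sample coordinate is arbitrary:

```python
import geopandas as gpd
from shapely.geometry import Point

# one decimal-degree point, declared as NAD83 (EPSG:4269)
pts = gpd.GeoSeries([Point(-71.06, 42.36)], crs="EPSG:4269")

# reproject the whole series to NAD27 (EPSG:4267)
pts27 = pts.to_crs("EPSG:4267")
print(pts27.crs)
```

Note that the accuracy of a NAD83-to-NAD27 transform depends on which datum shift grids pyproj has available; without them it may fall back to a cruder ballpark transformation.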