
Scripts for sampling geo data sets by region name


Geo sampling: Randomly sample locations on streets


Say you want to estimate the average number of potholes per kilometer of street in a city, or a similar quantity. To do so, you need to sample locations on the streets. This package helps you sample those locations. In particular, it implements the following sampling strategy:

Sampling Strategy

1. Sampling Frame

Get all the streets in the region of interest from OpenStreetMap (OSM). To do that, the package first downloads administrative boundary data, in ESRI shapefile format, for the country in which the region is located from http://www.gadm.org/country. The administrative data is organized in nested levels: for instance, cities are nested in states, which are nested in countries. The user can choose a city or a state, but not a portion of a city. The package then uses the pyshp package to build a URL for http://extract.bbbike.org, from which the OSM data is downloaded.
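As a rough illustration of turning a region boundary into something a download request can use, the sketch below computes the bounding box of a boundary outline. It is not the package's code: the function name bounding_box and the outline coordinates are made up for this example, and whether the extract request needs a bounding box or the full polygon is an assumption here.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def bounding_box(points: Iterable[Point]) -> Tuple[float, float, float, float]:
    """Return (min_lon, min_lat, max_lon, max_lat) for a boundary outline."""
    pts = list(points)
    if not pts:
        raise ValueError("boundary has no points")
    lons = [p[0] for p in pts]
    lats = [p[1] for p in pts]
    return (min(lons), min(lats), max(lons), max(lats))


# Made-up outline for illustration only; not real GADM data.
outline = [(103.7, 1.35), (103.9, 1.35), (103.9, 1.47), (103.7, 1.47)]
box = bounding_box(outline)
print(box)
```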

2. Sampling Design

  • For each street (or road), start from one end and split the street into 0.5 km segments until the end of the street is reached. (The last segment, or the only segment if the street is shorter than 0.5 km, can be shorter than 0.5 km.)

  • Get the lat/long of the starting and ending points of each segment, and assume that the street is a straight line between those two points.

  • Next, create a database of all the segments.

  • Sample rows from the database and produce a CSV of the sampled segments.

  • Plot the lat/long of the sampled segments, shading the full extent of each segment. The shaded regions are the regions for which data needs to be collected.
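The splitting step can be sketched as follows. This is a minimal sketch, not the package's implementation: it assumes a street is given as a list of (lat, long) vertices, measures distance with the haversine formula on a spherical Earth, and places cut points by linear interpolation in lat/long, which is a reasonable approximation over short distances. The names haversine_km and split_street are made up for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, long) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def split_street(vertices, seg_km=0.5):
    """Split a street into consecutive segments of seg_km along its path.

    Returns a list of (start, end) point pairs. The last segment may be
    shorter than seg_km.
    """
    segments = []
    start = vertices[0]
    travelled = 0.0  # path distance covered since `start`
    for a, b in zip(vertices, vertices[1:]):
        edge = haversine_km(a, b)
        pos = 0.0  # distance along this edge already consumed
        while edge > 0 and travelled + (edge - pos) >= seg_km:
            pos += seg_km - travelled
            t = pos / edge  # fraction of the edge at the cut point
            cut = (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
            segments.append((start, cut))
            start, travelled = cut, 0.0
        travelled += edge - pos
    if travelled > 1e-9:  # leftover piece shorter than seg_km
        segments.append((start, vertices[-1]))
    return segments


# A made-up street of roughly 1.25 km running east along the equator.
street = [(0.0, 0.0), (0.0, 0.0112)]
for seg_start, seg_end in split_street(street):
    print(seg_start, seg_end, round(haversine_km(seg_start, seg_end), 3))
```

Each (start, end) pair corresponds to one row of the segment database described above; only the two endpoints are kept, in line with the straight-line assumption.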

3. Data Collection

Collect data on the highlighted segments.
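Once data is collected, a quantity such as potholes per kilometer can be estimated from the sampled segments. The sketch below is one simple option, a ratio of totals, and assumes the segments were drawn as a simple random sample. The function name rate_per_km and the observations are made up for illustration; the package itself does not prescribe an estimator.

```python
def rate_per_km(observations):
    """Estimate events per km from (count, length_km) pairs, one per sampled segment."""
    total_count = 0
    total_km = 0.0
    for count, length_km in observations:
        total_count += count
        total_km += length_km
    if total_km == 0:
        raise ValueError("no surveyed length")
    return total_count / total_km


# Made-up field data: (potholes counted, segment length in km).
observed = [(3, 0.5), (0, 0.5), (1, 0.25)]
estimate = rate_per_km(observed)
print(estimate)
```

Dividing total counts by total surveyed length, rather than averaging per-segment rates, keeps short end-of-street segments from being over-weighted.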

Installation

Prerequisites

The package requires Python 3.10 or higher. Install the package from PyPI:

pip install geo-sampling

For development installation:

git clone https://github.com/geosensing/geo_sampling.git
cd geo_sampling
pip install -e .

Usage

The package provides two main CLI commands:

geo_roads

Process geographic regions to extract road segments:

geo_roads -c Singapore -n North -l 1

sample_roads

Sample from processed road segments:

sample_roads input_roads.csv output_sample.csv --sample-size 100
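To make the sampling step concrete, here is a minimal sketch of drawing a simple random sample of rows from a segment CSV using only the standard library. It is not the sample_roads implementation: the column names, the in-memory CSV text, and the function name sample_rows are hypothetical, and the real file produced by geo_roads may be laid out differently.

```python
import csv
import io
import random


def sample_rows(rows, n, seed=None):
    """Return up to n rows drawn without replacement (simple random sample)."""
    rows = list(rows)
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(rows, min(n, len(rows)))


# Hypothetical segment table; column names are assumptions for this example.
roads_csv = (
    "segment_id,start_lat,start_long,end_lat,end_long\n"
    "1,1.40,103.80,1.40,103.8045\n"
    "2,1.41,103.81,1.41,103.8145\n"
    "3,1.42,103.82,1.42,103.8245\n"
    "4,1.43,103.83,1.43,103.8345\n"
    "5,1.44,103.84,1.44,103.8445\n"
)

reader = csv.DictReader(io.StringIO(roads_csv))
all_rows = list(reader)
sampled = sample_rows(all_rows, 3, seed=42)

# Write the sample back out as CSV (to a string here; a file works the same way).
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
writer.writeheader()
writer.writerows(sampled)
print(out.getvalue())
```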

Documentation

For complete documentation, visit the project documentation page.


Authors

Suriyan Laohaprapanon and Gaurav Sood

License

Scripts are released under the MIT License.
