Geologist

Dataset description

The dataset consists of 20 categorical and continuous parameters for 513 oil deposits. There are 72 missing values in one categorical and one continuous parameter, which must be filled in.

Tasks

Provide an approach that fills in the missing values as accurately as possible.
Provide an approach to assess the reliability of parameter values (anomaly detection).
Provide a method to generate synthetic oil deposit data.
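To make the three tasks concrete, here is a minimal sketch using only the Python standard library. It is not the approach taken in this repository: k-nearest-neighbour imputation, a median/MAD outlier rule, and a smoothed bootstrap are simple stand-ins chosen for illustration. The function names, the column names (`porosity`, `depth_m`, `lithology`), and the toy records are all hypothetical.

```python
import random
import statistics
from collections import Counter


def knn_impute(rows, target, features, k=2):
    """Fill None in `target` using the k nearest complete rows (by `features`)."""
    # Scale each feature by its observed range so no single unit dominates.
    spans = {}
    for f in features:
        seen = [r[f] for r in rows if r[f] is not None]
        spans[f] = (max(seen) - min(seen)) or 1.0
    donors = [r for r in rows
              if r[target] is not None and all(r[f] is not None for f in features)]
    filled = []
    for row in rows:
        row = dict(row)  # copy, so the input is left untouched
        usable = all(row[f] is not None for f in features)
        if row[target] is None and donors and usable:
            def dist(d):
                return sum(((row[f] - d[f]) / spans[f]) ** 2 for f in features)
            values = [d[target] for d in sorted(donors, key=dist)[:k]]
            if isinstance(values[0], str):
                row[target] = Counter(values).most_common(1)[0][0]  # categorical: mode
            else:
                row[target] = statistics.mean(values)  # continuous: mean
        filled.append(row)
    return filled


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (median/MAD based) exceeds threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread, so the score is undefined
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - med) / mad) > threshold]


def smoothed_bootstrap(rows, numeric_keys, n, noise=0.1, seed=0):
    """Resample complete rows and jitter numeric fields with Gaussian noise."""
    rng = random.Random(seed)
    complete = [r for r in rows if all(v is not None for v in r.values())]
    sigma = {key: statistics.pstdev([r[key] for r in complete]) for key in numeric_keys}
    samples = []
    for _ in range(n):
        row = dict(rng.choice(complete))
        for key in numeric_keys:
            row[key] += rng.gauss(0.0, noise * sigma[key])
        samples.append(row)
    return samples


# Hypothetical toy records; None marks a blank.
deposits = [
    {"porosity": 0.21, "depth_m": 1500.0, "lithology": "sandstone"},
    {"porosity": 0.19, "depth_m": 1600.0, "lithology": "sandstone"},
    {"porosity": 0.08, "depth_m": 3100.0, "lithology": "limestone"},
    {"porosity": 0.10, "depth_m": 2900.0, "lithology": "limestone"},
    {"porosity": None, "depth_m": 1550.0, "lithology": "sandstone"},
    {"porosity": 0.09, "depth_m": 3000.0, "lithology": None},
]

filled = knn_impute(deposits, "porosity", ["depth_m"])
filled = knn_impute(filled, "lithology", ["porosity", "depth_m"])
# Append one deliberately implausible porosity to exercise the outlier rule.
suspect = mad_outliers([r["porosity"] for r in filled] + [0.95])
synthetic = smoothed_bootstrap(filled, ["porosity", "depth_m"], n=5)
```

The jitter in `smoothed_bootstrap` is unconstrained, so a real pipeline would need to clip or reject physically impossible values (for example, negative porosity). On the real data, the imputation step is best evaluated by hiding known values and measuring how well they are recovered.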

Why are these tasks important?

Accurate porosity estimation is vital for improving oil and gas recovery, selecting cost-effective production schemes, optimizing well placement, developing geothermal energy, and more. Getting the models right at this stage can therefore save a company a great deal of money.
Reservoir data is scarce, and new data is very expensive to obtain. When there is not enough data, we want to generate plausible synthetic data.
When the data is complete, we want to know how far it can be trusted, because mistakes in the oil industry are very expensive.

Together, these tasks form a complete, general-purpose pipeline for preprocessing oil reservoir data.
