Exploring Log ASCII Standard files using Python and Streamlit
LAS files are a standard and simple way to transfer and store well-log and/or petrophysical data within the oil and gas industry. The format was developed in the late 80s and early 90s by the Canadian Well Logging Society as a way to standardise and organise digital log information. LAS files are essentially structured ASCII files that contain multiple sections with information about the well and data from it; as such, they can be readily viewed within a typical text editor like Notepad or TextEdit.
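To give a feel for the format, below is a heavily shortened, made-up sketch of the kind of text you might see when opening a LAS file in a text editor; the section markers (~Version, ~Well, ~Curve, ~ASCII) follow the LAS 2.0 convention, and all mnemonics and values are illustrative only.
~Version
 VERS.          2.0 : CWLS Log ASCII Standard - Version 2.0
 WRAP.           NO : One line per depth step
~Well
 STRT.M   4500.0000 : First depth
 STOP.M   4550.0000 : Last depth
 WELL.    EXAMPLE-1 : Well name
~Curve
 DEPT.M              : Measured depth
 GR  .API            : Gamma ray
~ASCII
 4500.0000   55.2000
 4500.5000   58.9000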
Streamlit is one of my favourite Python libraries for creating quick and easy-to-use dashboards and interactive tools. It is also great for building an app that you or the end user can run without having to worry about code. Therefore, within this article, we are going to see how we can use Streamlit to build a data explorer app for LAS files.
If you want to see the full app in action, check out the short video below.
Or explore the source code on GitHub:
If you want to see how to work with LAS files in Python, then the following articles may be of interest:
The first part of our app involves importing the required libraries and modules. These are Streamlit, lasio, pandas, StringIO from the io module, and Plotly.
After importing these libraries, we can add a line at the end to set the page layout to the full width of the browser and change the app's title in the browser window.
import streamlit as st
import lasio
import pandas as pd
from io import StringIO

# Plotly imports
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.express as px
st.set_page_config(layout="wide", page_title='LAS Explorer v.0.1')
To check Streamlit is working, we can run the following command in the terminal:
streamlit run app.py
This will open up a browser window with a blank Streamlit app.
The first piece of code we are going to add to this app is a call to st.sidebar. This will create a column on the left-hand side of the app, and we will use this to store our navigation menu and file uploader widget.
st.sidebar.write('# LAS Data Explorer')
st.sidebar.write('To begin using the app, load your LAS file using the file upload option below.')
We can use st.sidebar.write to add a few messages and instructions for the end user. In this example, we will keep it relatively simple with the app name and a message on how to get started.
Once the sidebar is in place, we can start implementing the file uploader piece of our code.
las_file = None
uploadedfile = st.sidebar.file_uploader(' ', type=['.las'])
las_file, well_data = load_data(uploadedfile)
if las_file:
    st.sidebar.success('File Uploaded Successfully')
    st.sidebar.write(f'<b>Well Name</b>: {las_file.well.WELL.value}',
                     unsafe_allow_html=True)
To do this, we need to call upon st.file_uploader. We will also restrict the file types to just .las files. To make this more useful, we may want to include the capitalised version of the extension as well, as shown in the sketch below.
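A minimal sketch of that variation, assuming we simply extend the list passed to the type argument, would look like this:
# Accept both lower-case and upper-case extensions (a variation on the uploader call above)
uploadedfile = st.sidebar.file_uploader(' ', type=['.las', '.LAS'])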
Next, we will call upon the load_data function, which we will come to shortly. This function will be set up to return las_file as a lasio las file object and well_data as a dataframe containing the well log measurements.
Following that, we will check if we have a las file. If it is set to None, then nothing will happen; however, if the file has been loaded successfully through the load_data function, it will not be None, and the code underneath the if statement will execute.
The code within the if statement essentially displays a coloured callout followed by the well name from the las file.
Before we run the Streamlit app, we need to create the load_data function. This will allow us to read the data and generate the lasio las file object and pandas dataframe.
@st.cache
def load_data(uploaded_file):
    if uploaded_file is not None:
        try:
            bytes_data = uploaded_file.read()
            str_io = StringIO(bytes_data.decode('Windows-1252'))
            las_file = lasio.read(str_io)
            well_data = las_file.df()
            well_data['DEPTH'] = well_data.index
        except UnicodeDecodeError as e:
            st.error(f"error loading log.las: {e}")
            las_file = None
            well_data = None
    else:
        las_file = None
        well_data = None

    return las_file, well_data
When we run the Streamlit LAS Data Explorer app, we will see our sidebar on the left along with the file uploader widget.
We can then click on Browse Files and search for a las file.
Once that file has been loaded, we will see the green callout saying the file was loaded successfully, followed by the well name contained within the file.
When someone launches the LAS Data Explorer app for the first time, it would be great to display the app’s name and a brief description of what it does.
st.title('LAS Data Explorer - Version 0.2.0')
st.write('''LAS Data Explorer is a tool designed using Python and
Streamlit to help you view and gain an understanding of the contents
of a LAS file.''')
st.write('\n')
When we rerun the app, we will now see our home page. This could be expanded to include extra instructions, details about the app and how to get in touch if there is a problem.
When building a Streamlit app, it is good practice to split code up into functions and call them at the appropriate time. This makes the code more modular and easier to navigate.
For our home page, we will place the above code into a function called home().
def home():
    st.title('LAS Data Explorer - Version 0.2.0')
    st.write('''LAS Data Explorer is a tool designed using Python and
    Streamlit to help you view and gain an understanding of the contents
    of a LAS file.''')
    st.write('\n')
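If we later want the extra instructions and contact details mentioned above, a minimal sketch of the sort of lines we might append inside home() is shown below; the headings and wording are made up for illustration.
    # Hypothetical additions to the home page
    st.write('## How to Use')
    st.write('Load a LAS file using the uploader in the sidebar, then choose a page from the Navigation menu.')
    st.write('## Found a Problem?')
    st.write('Please raise an issue on the GitHub repository.')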
When building Streamlit apps, it is very easy to fall into the trap of continuously adding sections one after the other, resulting in a long, scrollable web page.
One way to make Streamlit apps more navigable is by adding a navigation menu. This allows you to split content over multiple pages.
One way to achieve this is to use a series of radio buttons, which, when toggled, will change the content displayed on the main part of the app.
First, we need to assign a title for our navigation section, and then we call upon st.sidebar.radio and pass in a list of the pages we want the user to be able to navigate to.
# Sidebar Navigation
st.sidebar.title('Navigation')
options = st.sidebar.radio('Select a page:',
['Home', 'Header Information', 'Data Information',
'Data Visualisation', 'Missing Data Visualisation'])
When we run the app, we will see that we now have a navigation menu represented by the radio buttons.
At the moment, if you click on the buttons, nothing will happen.
We need to tell Streamlit what to do when a selection is made.
This is achieved by creating an if/elif statement like the one below. When an option is selected, a specific function is called.
For example, if the user has Home selected, the home function that was created earlier will be called and its content displayed.
if options == 'Home':
    home()
elif options == 'Header Information':
    header(las_file)
elif options == 'Data Information':
    raw_data(las_file, well_data)
elif options == 'Data Visualisation':
    plot(las_file, well_data)
elif options == 'Missing Data Visualisation':
    missing(las_file, well_data)
Let’s begin implementing the other sections to start displaying some content.
Within each las file, there is a section at the top that contains information about the well. This includes Well Name, Country, Operator and much more.
To read this information, we will create a new function called header and then loop through each row within the header.
To prevent errors when the user clicks on the Header Information radio button, we need to check if a las file object has been created during the loading process. If it has not, we present the user with a warning instead.
Then, for each header item, we will display the descriptive name (item.descr), the mnemonic (item.mnemonic) and the associated value (item.value).
def header(las_file):
    st.title('LAS File Header Info')
    if not las_file:
        st.warning('No file has been uploaded')
    else:
        for item in las_file.well:
            st.write(f"<b>{item.descr.capitalize()} ({item.mnemonic}):</b> {item.value}",
                     unsafe_allow_html=True)
When the app is rerun, and the Header Information page is selected from the Navigation menu, we will now see the relevant well information.
After the header information has been successfully read, we next want to look at what well log measurements are contained within the las file.
To do this, we will create a simple function called raw_data, which will:
- go through each measurement within the las file and write out its mnemonic, unit and description
- provide a count of the total number of measurements present
- create a statistical summary table for each measurement using the describe method from pandas
- create a data table with all of the raw values
This is a lot for a single function to do and could benefit from being tidied up, but for this simple app, we will keep it all together.
def raw_data(las_file, well_data):
    st.title('LAS File Data Info')
    if not las_file:
        st.warning('No file has been uploaded')
    else:
        st.write('**Curve Information**')
        for count, curve in enumerate(las_file.curves):
            st.write(f"{curve.mnemonic} ({curve.unit}): {curve.descr}",
                     unsafe_allow_html=True)
        st.write(f"<b>There are a total of: {count+1} curves present within this file</b>",
                 unsafe_allow_html=True)

        st.write('<b>Curve Statistics</b>', unsafe_allow_html=True)
        st.write(well_data.describe())
        st.write('<b>Raw Data Values</b>', unsafe_allow_html=True)
        st.dataframe(data=well_data)
When the Streamlit app is rerun, we will see all of the information relating to the well log measurements.
First, we have the well measurement information and associated statistics.
Followed by the raw data values.
As with any dataset, it can be hard to get a handle on what the data looks like by analysing the raw numbers. To take things to the next level, we can use interactive plots.
These will make it easier for the end user to get a better understanding of the data.
The following code generates multiple plots on a Streamlit page. It is all contained within a single function for ease of use within this app. Remember, each function represents a page within the LAS Data Explorer app.
To save having to use multiple pages, the code below will generate three expanders for three different plots: a line plot, a histogram and a scatter plot (also known as a cross plot within Petrophysics).
def plot(las_file, well_data):
    st.title('LAS File Visualisation')
    if not las_file:
        st.warning('No file has been uploaded')
    else:
        columns = list(well_data.columns)
        st.write('Expand one of the following to visualise your well data.')
        st.write("""Each plot can be interacted with. To change the scales of a plot/track, click on the left hand or right hand side of the scale and change the value as required.""")

        with st.expander('Log Plot'):
            curves = st.multiselect('Select Curves To Plot', columns)
            if len(curves) <= 1:
                st.warning('Please select at least 2 curves.')
            else:
                curve_index = 1
                fig = make_subplots(rows=1, cols=len(curves), subplot_titles=curves, shared_yaxes=True)
                for curve in curves:
                    fig.add_trace(go.Scatter(x=well_data[curve], y=well_data['DEPTH']), row=1, col=curve_index)
                    curve_index += 1
                fig.update_layout(height=1000, showlegend=False, yaxis={'title': 'DEPTH', 'autorange': 'reversed'})
                fig.layout.template = 'seaborn'
                st.plotly_chart(fig, use_container_width=True)

        with st.expander('Histograms'):
            col1_h, col2_h = st.columns(2)
            col1_h.header('Options')

            hist_curve = col1_h.selectbox('Select a Curve', columns)
            log_option = col1_h.radio('Select Linear or Logarithmic Scale', ('Linear', 'Logarithmic'))
            hist_col = col1_h.color_picker('Select Histogram Colour')
            st.write('Colour is ' + hist_col)

            if log_option == 'Linear':
                log_bool = False
            elif log_option == 'Logarithmic':
                log_bool = True

            histogram = px.histogram(well_data, x=hist_curve, log_x=log_bool)
            histogram.update_traces(marker_color=hist_col)
            histogram.layout.template = 'seaborn'
            col2_h.plotly_chart(histogram, use_container_width=True)

        with st.expander('Crossplot'):
            col1, col2 = st.columns(2)
            col1.write('Options')

            xplot_x = col1.selectbox('X-Axis', columns)
            xplot_y = col1.selectbox('Y-Axis', columns)
            xplot_col = col1.selectbox('Colour By', columns)
            xplot_x_log = col1.radio('X Axis - Linear or Logarithmic', ('Linear', 'Logarithmic'))
            xplot_y_log = col1.radio('Y Axis - Linear or Logarithmic', ('Linear', 'Logarithmic'))

            if xplot_x_log == 'Linear':
                xplot_x_bool = False
            elif xplot_x_log == 'Logarithmic':
                xplot_x_bool = True

            if xplot_y_log == 'Linear':
                xplot_y_bool = False
            elif xplot_y_log == 'Logarithmic':
                xplot_y_bool = True

            col2.write('Crossplot')

            xplot = px.scatter(well_data, x=xplot_x, y=xplot_y, color=xplot_col, log_x=xplot_x_bool, log_y=xplot_y_bool)
            xplot.layout.template = 'seaborn'
            col2.plotly_chart(xplot, use_container_width=True)
Once the above code has been implemented, we can see that we have the LAS File Visualisation page with three expandable boxes.
Within geoscience and petrophysics, we often plot data on line plots, commonly referred to as log plots. The y-axis usually represents the depth along a wellbore, while the x-axis represents the data we want to visualise. This allows us to easily visualise trends and patterns in these measurements with depth.
Within the Log Plot section, we can select specific columns from the dataframe and display them in the interactive Plotly chart.
Histograms show the data distribution and allow us to summarise a large amount of data within a small and concise plot.
Within the Histogram section, we have a few basic options. We can select a column from the dataframe to display and decide whether we want that displayed linearly or logarithmically.
Finally, we have the option to use the colour picker from Streamlit. This allows you to choose the colour for the histogram and can enhance your visualisation for presentations and reports.
Scatter plots (crossplots) are commonly used within petrophysics and data science to compare two variables. From this type of graph, we can understand if there is a relationship between the two variables and how strong that relationship is.
Within the Crossplot section of the Data Visualisation page, we can select x and y axis variables, as well as a third variable, to colour code the data.
Finally, we can set the x and y axes to linear scale or logarithmic scale.
Missing data is one of the most common data quality issues we face when working with datasets. Data can be missing for a multitude of reasons, ranging from sensor failure to improper and possibly careless data management.
When working with datasets, it is essential that missing data is identified and the root cause behind that data being missing is understood. A proper understanding of why data is missing is key to developing pragmatic solutions on how to deal with the missing data, especially as many machine learning algorithms are incapable of handling missing values.
Within Python, we could use the textual data summaries provided by the pandas describe function. Whilst this is useful, it often helps to visualise missing data values on graphs. This allows us to easily identify patterns and relationships that may not be obvious with text-based summaries.
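As a quick reference, a minimal sketch of those text-based summaries, assuming well_data is the dataframe returned by load_data, could look like this:
# Non-null count for each curve, taken from the describe() summary
st.write(well_data.describe().loc['count'])

# Or count the missing values in each curve directly
st.write(well_data.isna().sum())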
To create interactive plots of data completeness, we can leverage the Plotly library. The code below sets up the Missing Data Visualisation page within the LAS Data Explorer app.
First, we check if we have a valid las file; if we do, we start creating the page with some explanatory text.
Next, we give the user an option to select all data within the dataframe or to pick specific columns. To the right of this, we allow the user to change the fill colour used in the plot.
Then we move on to plotting the data based on the user's selection.
def missing(las_file, well_data):
    st.title('LAS File Missing Data')
    if not las_file:
        st.warning('No file has been uploaded')
    else:
        st.write("""The following plot can be used to identify the depth range of each of the logging curves.
        To zoom in, click and drag on one of the tracks with the left mouse button.
        To zoom back out double click on the plot.""")

        data_nan = well_data.notnull().astype('int')
        # Need to setup an empty list for len check to work
        curves = []
        columns = list(well_data.columns)
        columns.pop(-1)  # pop off depth

        col1_md, col2_md = st.columns(2)

        selection = col1_md.radio('Select all data or custom selection', ('All Data', 'Custom Selection'))
        fill_color_md = col2_md.color_picker('Select Fill Colour', '#9D0000')

        if selection == 'All Data':
            curves = columns
        else:
            curves = st.multiselect('Select Curves To Plot', columns)

        if len(curves) <= 1:
            st.warning('Please select at least 2 curves.')
        else:
            curve_index = 1
            fig = make_subplots(rows=1, cols=len(curves), subplot_titles=curves, shared_yaxes=True, horizontal_spacing=0.02)

            for curve in curves:
                fig.add_trace(go.Scatter(x=data_nan[curve], y=well_data['DEPTH'],
                                         fill='tozerox', line=dict(width=0), fillcolor=fill_color_md),
                              row=1, col=curve_index)
                fig.update_xaxes(range=[0, 1], visible=False)
                curve_index += 1

            fig.update_layout(height=700, showlegend=False, yaxis={'title': 'DEPTH', 'autorange': 'reversed'})
            # rotate all the subplot titles by 90 degrees
            for annotation in fig['layout']['annotations']:
                annotation['textangle'] = -90
            fig.layout.template = 'seaborn'
            st.plotly_chart(fig, use_container_width=True)
When we visit this page of the LAS Data Explorer, we are presented with an interactive Plotly chart, as seen below. If the user has selected “All Data”, then all the columns will be shown.
If a user has selected “Custom Selection” then they can select the columns directly from the dataframe.
Check out my article below if you want to see other ways of identifying missing values using Python:
Within this article, we have seen how to build an app using Streamlit and Python for exploring LAS files. Whilst this is a basic app, it can provide a useful alternative to looking at raw LAS files within a text editor. Additional functionality could be added to edit the files or convert them to another standard format. The possibilities are endless!
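As a sketch of one such extension (not part of the app built above), the curve data could be offered for download as a CSV file using Streamlit's download button; well_data is assumed to be the dataframe returned by load_data.
# Hypothetical extension: let the user download the curve data as a CSV file
csv_data = well_data.to_csv(index=False)
st.download_button('Download data as CSV', data=csv_data,
                   file_name='well_data.csv', mime='text/csv')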
The data used within this tutorial is a subset of the Volve Dataset that Equinor released in 2018. Full details of the dataset, including the licence, can be found at the link below.
The Volve data license is based on CC BY 4.0 license. Full details of the license agreement can be found here: