Uber open-sourced their M3 metrics platform, and while skimming through how it works, I found an approach called LTTB downsampling. Here's what it's all about...

LTTB stands for Largest Triangle Three Buckets, a downsampling algorithm used to query and visualize millions of time series data points efficiently. It preserves the visual shape of the plot while reducing the number of points required to render it. This makes the frontend faster, with fewer data points to draw and less data to send over the network.

It works by dividing the time series into buckets and selecting one point per bucket in a way that preserves the largest visual area. So, instead of random sampling or simple averaging, LTTB picks the points that maintain the peaks, valleys, and overall trend of the data. For each bucket, it forms a triangle from three points: the previously selected point, a candidate point in the current bucket, and the average of the next bucket. The candidate that creates the largest triangle gets selected. This means important visual features survive the downsampling.

With this approach, the system can downsample 10,000 points to 500 and still get a chart that looks nearly identical to the original. Beyond M3, systems like Grafana, InfluxDB, and Prometheus have adopted LTTB or similar downsampling techniques.

It's fascinating that a question as seemingly simple as "which points should we keep?" turns into such a rich algorithmic problem.
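Here is a minimal sketch of the LTTB idea in Python (my own illustration, not M3's actual implementation): the bucket boundaries and the triangle-area rule follow the description above, with the first and last points always kept.

```python
import numpy as np

def lttb(x, y, n_out):
    """Largest Triangle Three Buckets downsampling (sketch).

    x, y: numpy arrays of equal length (x assumed sorted).
    Keeps the first and last points, then picks from each bucket the
    point forming the largest triangle with the previously selected
    point and the average of the next bucket.
    """
    n = len(x)
    if n_out >= n or n_out < 3:
        return x, y
    # First and last points are fixed; the rest fill n_out - 2 buckets.
    bucket_size = (n - 2) / (n_out - 2)
    xs, ys = [x[0]], [y[0]]
    prev = 0  # index of the previously selected point
    for i in range(n_out - 2):
        # Boundaries of the current bucket
        start = int(i * bucket_size) + 1
        end = int((i + 1) * bucket_size) + 1
        # Average of the *next* bucket (the last point for the final bucket)
        nend = min(int((i + 2) * bucket_size) + 1, n)
        avg_x, avg_y = x[end:nend].mean(), y[end:nend].mean()
        # Triangle area (x2) for every candidate in the current bucket
        area = np.abs(
            (x[prev] - avg_x) * (y[start:end] - y[prev])
            - (x[prev] - x[start:end]) * (avg_y - y[prev])
        )
        prev = start + int(area.argmax())
        xs.append(x[prev])
        ys.append(y[prev])
    xs.append(x[-1])
    ys.append(y[-1])
    return np.array(xs), np.array(ys)
```

Downsampling 10,000 noisy points to 500 with this keeps every selected point in time order and preserves the endpoints, which is why the rendered chart still looks right.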
Data Visualization Software
-
Best LLM-based open-source tool for data visualization, non-tech friendly

CanvasXpress is a JavaScript library with built-in LLM and copilot features. This means users can chat with the LLM directly, with no code needed. It works with visualizations in a web page, and from R or Python.

It's funny how I came across this tool first and only later realized it was built by someone I know: Isaac Neuhaus. I called Isaac, of course. The tool was originally built internally at the company he works for, designed to analyze genomics and research data, which demands high reliability and accuracy.

➡️ Link: https://lnkd.in/gk5y_h7W

As an open-source tool, it's very powerful and worth exploring. Here are the features that stand out the most to me:

𝐀𝐮𝐭𝐨𝐦𝐚𝐭𝐢𝐜 𝐆𝐫𝐚𝐩𝐡 𝐋𝐢𝐧𝐤𝐢𝐧𝐠: Visualizations on the same page are automatically connected. Selecting data points in one graph highlights them in the other graphs. No extra code is needed.

𝐏𝐨𝐰𝐞𝐫𝐟𝐮𝐥 𝐓𝐨𝐨𝐥𝐬 𝐟𝐨𝐫 𝐂𝐮𝐬𝐭𝐨𝐦𝐢𝐳𝐚𝐭𝐢𝐨𝐧:
- Filtering data like in Spotfire.
- An interactive data table for exploring datasets.
- A detailed customizer designed for end users.

𝐀𝐝𝐯𝐚𝐧𝐜𝐞𝐝 𝐀𝐮𝐝𝐢𝐭 𝐓𝐫𝐚𝐢𝐥: Tracks every customization and keeps a detailed record. (This feature stands out compared to other open-source tools I've tried.)

➡️ Explore it here: https://lnkd.in/gk5y_h7W

Isaac's team has also published this tool in a peer-reviewed journal and is working on publishing its LLM capabilities.

#datascience #datavisualization #programming #datanalysis #opensource
-
What does a real-time credit card transaction stream look like in practice?

One of my Coaching participants built exactly that: an impressive OLTP streaming system on AWS. Fully serverless. Fully orchestrated. And built around real-world use cases like fraud detection and transaction monitoring.

And this is how it works: data comes in through an API, flows into Kinesis, lands in an S3 bucket, and gets picked up by a Lambda function that writes it into a structured Postgres database (RDS). From there, QuickSight takes over and visualizes the data for reporting and analysis. CI/CD? All done via GitHub Actions and CDK.

👉 Link to Faiz Puad's GitHub in the comments to learn more!

I'm sharing this to show what's possible with the right tools, a solid architecture, and the willingness to build something end-to-end. If you want to build projects like this yourself: this is exactly the kind of work we cover in my Coaching program. Practical, hands-on, and job-relevant projects, with expert mentorship all along the way.

🤝 Check it out via the link in the comments!
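To make the Kinesis-to-Postgres Lambda step concrete, here is a hypothetical sketch of what such a handler could look like. The event shape is the standard Kinesis trigger format (base64-encoded payloads under `Records[].kinesis.data`); the transaction fields (`card_id`, `amount`, `timestamp`) and the injected `write` callback are my own assumptions for illustration, not taken from Faiz's repo.

```python
import base64
import json

def handler(event, context=None, write=None):
    """Sketch of the Lambda step: decode Kinesis records and hand each
    transaction to a writer.

    `write` is injected so the parsing logic can run without AWS; in a
    real function it would wrap a psycopg2 INSERT into the RDS table.
    The field names below are illustrative assumptions.
    """
    rows = []
    for record in event["Records"]:
        # Kinesis delivers each payload base64-encoded
        payload = base64.b64decode(record["kinesis"]["data"])
        txn = json.loads(payload)
        rows.append((txn["card_id"], txn["amount"], txn["timestamp"]))
    if write is not None:
        # e.g. executemany("INSERT INTO transactions VALUES (%s, %s, %s)", rows)
        write(rows)
    return {"processed": len(rows)}
```

Keeping the database write behind a callback like this also makes the decode/transform logic unit-testable, which matters once the pipeline is wired into CI/CD.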
-
When trending more than one series in a line chart 📈, you have four different options for visualizing the information. I thought it would be helpful to review the strengths and weaknesses of each approach.

SINGLE CHART
Strengths: When you plot all your data points in a single line chart, it's the easiest option for direct comparisons between the series. It's also the most space-efficient option.
Weaknesses: If the data series have vastly different scales or units, they can become confusing or misleading (dual axes). When there are many data series, the overlapping trends can become cluttered, and key values can be obscured by other series.

SIDE-BY-SIDE CHARTS
Strengths: Each line chart can have a unique scale, which can reduce potential confusion. Separating the time series into separate charts makes each chart easier to interpret individually, with less clutter and overlapping data. The y-axis of each line chart doesn't need to be compacted, which may be useful with certain datasets.
Weaknesses: Having the time series side by side makes it harder to compare specific points in time across the two charts. Because the layout requires more horizontal space, both charts will need to be reduced in size (x-axes), making the labels and features of the data smaller and harder to read.

STACKED CHARTS
Strengths: Each line chart can have its own unique scale. With the line charts stacked on each other, it's easier for the audience to compare the data sequentially and at specific points in time.
Weaknesses: With the line charts stacked on each other, there is less vertical space, which compacts the y-axis of each chart. This treatment flattens the data and makes the trends harder to see.

SMALL MULTIPLES
Strengths: Separating the time series into individual charts makes it easier to identify and compare patterns and variations across multiple variables at a high level.
Weaknesses: To make the most sense, all multiples must have consistent scales. To gain a high-level perspective of patterns, you're sacrificing the ability to compare specific values directly.

When designing a data scene with trended data, you'll need to consider what message you're trying to convey to your audience. How you approach it will depend on the dataset and what you're trying to communicate or emphasize. For example, if you have weekly sales results for several stores, you might use small multiples to highlight the similar or diverging patterns between them. If you were comparing the impact of a change to your HR benefits across two distinct metrics, you might use a stacked chart approach to show how the change impacted one metric but not the other.

What other pros or cons do you consider when choosing your line chart approach?

🔽 🔽 🔽 🔽 🔽
Craving more of my data storytelling, analytics, and data culture content? Sign up for my brand new newsletter today: https://lnkd.in/gRNMYJQ7
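As an illustration of the small-multiples option, here is a short matplotlib sketch using made-up weekly sales for four hypothetical stores. The `sharex`/`sharey` flags enforce the consistent scales that small multiples need to be read fairly.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical weekly sales for four stores (random walks around 100)
rng = np.random.default_rng(1)
weeks = np.arange(52)
stores = {f"Store {i + 1}": 100 + rng.normal(0, 5, 52).cumsum() for i in range(4)}

# Small multiples: one panel per store, shared axes so scales stay consistent
fig, axes = plt.subplots(2, 2, figsize=(8, 5), sharex=True, sharey=True)
for ax, (name, sales) in zip(axes.flat, stores.items()):
    ax.plot(weeks, sales)
    ax.set_title(name)
fig.suptitle("Weekly sales by store")
fig.tight_layout()
fig.savefig("small_multiples.png")
```

Swapping the `subplots` call to `(1, 2)` or stacking with `(2, 1)` gives the side-by-side and stacked layouts discussed above, so the same few lines let you try all four options against your own data.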
-
📢 Power BI vs Tableau: Which Data Visualization Tool is Right for You?

I have worked with both Power BI and Tableau on projects with different requirements. Power BI and Tableau stand as titans in the realm of business intelligence and data visualization, each offering distinct advantages tailored to different business needs. This comparison will help you decide which of these tools to use for your data science and analytics needs.

The main differences between them: Power BI, with its seamless integration with Microsoft products and user-friendly interface, proves advantageous for organizations heavily invested in the Microsoft ecosystem. Tableau, on the other hand, boasts unparalleled data visualization capabilities and advanced analytics features, making it a preferred choice for data-driven enterprises requiring sophisticated insights.

Power BI:
➡️ Developed by Microsoft
➡️ More affordable pricing options, with a free version and a lower-cost Pro version
➡️ Strong integration with other Microsoft products, such as Excel and Azure
➡️ Emphasizes ease of use, with a user-friendly interface and simplified data modeling
➡️ Offers real-time collaboration features

Tableau:
➡️ Developed by Tableau Software
➡️ Generally more expensive, with a free version (Public) but requiring more advanced licenses for enterprise use
➡️ Offers a wider range of advanced data visualization options and a more powerful data engine
➡️ Emphasizes data discovery and exploration, with robust data blending and data mapping capabilities
➡️ Offers robust mobile and web authoring options for easy sharing of insights and data visualizations

Happy Learning 😃! Any key points you would like to add? Let's discuss!

Follow Nirav Prajapati for more posts related to #DataAnalytics and #DataScience. #powerbi #datavisualization #tableau #dataanalytics
-
Noisy data makes trends hard to identify. Being unable to see trends causes two major issues:
1. Misguided decisions (not seeing the forest for the trees)
2. Wasted resources (variation is normal; not every change needs an answer)

So, how do we shed light on patterns in noisy data?

Stock prices are a great example of noisy data. Stocks generate new data every nanosecond, so analysts use two key methods to draw out trends over the course of days, weeks, months, and years.

1. Smoothing
This is easily accomplished by placing a moving average on top of the time series data, creating a smooth line that considers a larger set of data to offset near-term price fluctuations.

2. Changing Time Frequency
Looking at data by the nanosecond is not feasible for a human, so we naturally compress time into hours, days, weeks, etc. so that we can see how the data performed in those time periods. Stock prices are typically shown as candlestick charts, which mark the min, max, and open/close prices for each period.

Smoothing and zooming out in time are some of the best ways to handle noisy data, especially time series data.
Below is a basic example you can play with in Python to get a better sense of how these work 👇
--------
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Generate a hypothetical stock price dataset for 6 months (approximately 180 days)
np.random.seed(0)
dates = pd.date_range(start="2023-01-01", periods=180)
prices = np.random.normal(0.5, 0.75, size=180).cumsum() + 100

# Create a DataFrame
stock_data = pd.DataFrame(data={'Price': prices}, index=dates)

# Apply a 30-day moving average for smoothing
stock_data['30D_MA'] = stock_data['Price'].rolling(window=30).mean()

# Reduce the time frequency to weekly, taking the last price of the week
# Can change to 'M', 'Q', or 'D' for differing time frequencies
weekly_data = stock_data.resample('W').last()

# Plotting
plt.figure(figsize=(14, 7))
plt.plot(stock_data['Price'], label='Daily Prices', alpha=0.5)
plt.plot(stock_data['30D_MA'], label='30-Day Moving Average', linewidth=2)
plt.plot(weekly_data.index, weekly_data['Price'], label='Weekly Prices', marker='o', linestyle='-', linewidth=2)
plt.title('Stock Price with Smoothing and Time Frequency Reduction')
plt.xlabel('Date')
plt.ylabel('Price')
plt.legend()
plt.grid(True)
plt.show()
------

Hi, I'm Joe 👋 I work with young medtech, healthcare, & life science companies to help them understand their data and win in the market.

#dataanalytics #dataanalyst #datavisualization #powerbi #tableau
-
Building a real-time dashboard with data engineering skills.

Ever wondered how those mesmerizing real-time dashboards come to life? Buckle up, data enthusiasts, because we're diving deep into the engineering magic behind them! 🪄

📌 The journey of building a real-time dashboard, from data acquisition to visualization, and the tools that make it tick:
✅ Data pipelines: Python and Airflow orchestrate the smooth flow of data from various sources.
✅ Streaming technologies: Apache Kafka ensures real-time updates, keeping your dashboard always fresh. ⚡
✅ Visualization libraries: D3.js and Plotly paint the data into insightful charts and graphs.

🔖 But it's not just about tools! We'll also explore the thought process behind:
👉 Defining key metrics: What data truly tells the story you want to convey?
👉 Designing for clarity: How do you present information effectively and avoid visual overload?
👉 Ensuring performance: Keeping the dashboard snappy even with constant data updates. ⚡️

Share your favorite real-time dashboards and why they rock!

#dataengineering #dataanalytics #datavisualization #kafka #data
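On the "ensuring performance" point: a common trick is to keep only a bounded window of recent observations in memory rather than the full stream. Here is a minimal stdlib-only sketch of that idea (in a real pipeline the `add()` calls would come from a Kafka consumer, and `snapshot()` is what the dashboard would poll; both names are my own).

```python
from collections import deque

class RollingMetric:
    """Keep a dashboard metric snappy under constant updates by holding
    only the last `window` observations instead of the full history."""

    def __init__(self, window=100):
        # deque with maxlen drops the oldest point automatically on append
        self.values = deque(maxlen=window)

    def add(self, value):
        self.values.append(value)

    def snapshot(self):
        """Current aggregate the dashboard would render."""
        if not self.values:
            return {"avg": None, "count": 0}
        return {"avg": sum(self.values) / len(self.values), "count": len(self.values)}
```

Because memory and per-snapshot work are bounded by the window size, the dashboard's refresh cost stays constant no matter how long the stream runs.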
-
🚀 Power BI: Adding Custom Tooltips to KPI Cards!

While exploring Power BI capabilities, I discovered a creative workaround to add detailed tooltips to KPI cards - something that's not natively supported!

The Challenge 🎯
Standard KPI cards in Power BI lack the ability to show comprehensive tooltip information, limiting user interaction and data exploration.

My Solution 💡
I developed a custom HTML-based approach that transforms regular KPI cards into interactive, information-rich components:
✅ HTML-Wrapped DAX Measures: Converted standard measures into HTML format for enhanced styling
✅ Dual Theme Support: Implemented both light and dark modes for better user experience
✅ Dynamic Theming: Created a separate theme table with all color variations
✅ Rich Tooltips: Added detailed breakdowns including Monthly Recurring Revenue, One-time Sales, and Refunds

Key Features 🌟
Responsive Design: Adapts to different screen sizes
Theme Consistency: Seamless light/dark mode switching
Enhanced UX: Detailed information on hover without cluttering the main view
Professional Styling: Clean, modern card design with gradients and shadows

Implementation Highlights 🔧
// Theme table with comprehensive color schemes
Theme = DATATABLE(
    "ThemeMode", STRING,
    "ColorType", STRING,
    "ColorValue", STRING,
    // Light & Dark theme definitions...
)

Important Consideration ⚠️
While this solution works effectively, be mindful of the HTML card container's width, as it can overlap with other visuals. Proper positioning and sizing are crucial for optimal user experience.

Results 📊
✨ Enhanced user engagement with interactive KPI cards
✨ Better data storytelling through rich tooltips
✨ Professional, modern dashboard appearance
✨ Improved accessibility with theme options

This approach opens up new possibilities for creating more engaging and informative Power BI dashboards. Sometimes the best solutions come from thinking outside the box!

Share your experiences in the comments!
#PowerBI #DataVisualization #BusinessIntelligence #DAX #HTML #DataAnalytics #Innovation #Dashboard #Microsoft #DataScience
-
Day 1 of #BusinessIntelligence

Ever wondered why some dashboards make an impact while others confuse users? Here are 5 essential principles that I always follow when building dashboards:
• Know Your Audience: Understand the decisions they need to make.
• Prioritize KPIs: Focus on the most critical metrics.
• Simplicity is Key: Clutter can distract, so aim for clarity.
• Consistent Design: Maintain a consistent format, color scheme, and chart types.
• Iterate and Improve: Gather feedback and continually refine your dashboard.

I've applied these principles to a recent project where simplifying a complex dashboard led to higher user engagement and clearer insights. By understanding user needs and removing non-essential data, I turned it into an actionable tool.

What's the one principle you never skip when building dashboards?

#BusinessIntelligence #DashboardDesign #DataVisualization #PowerBI #Tableau #DataAnalysis
-
This month's Power BI update is quite exciting... 🤓 The PBI Core Visuals team is working on some pretty cool stuff lately. Let's dive into the details.

1️⃣ Marker Enhancements: Advanced controls for markers make data points pop:
↳ Customize by Categories or Series: Control marker styles at the category or series level.
↳ Marker Visibility Toggles: Toggle markers on/off for specific categories or series.
↳ Marker Shape & Transparency Control: Personalize markers by adjusting shapes (rotations supported, except circles) and sizes.
↳ Customizable Marker Borders: Adjustable color, transparency, and width.

2️⃣ Small Multiples for Card Visuals: Compare data across categories or dimensions with ease:
↳ Flexible Layout Options: Choose Single Column, Single Row, or Grid layouts.
↳ Overflow Handling: Use pagination or continuous scrolling to manage excess data.
↳ Advanced Styling Controls: Customize borders, gridlines, and background colors. Round corners for a modern look.
↳ Header & Title Customization: Control header settings and adjust titles for font, color, padding, and text wrap. Align with your report's branding.

3️⃣ New Text Slicer: Enhance data filtering with text-based searches:
↳ Intuitive Text Filtering: Type into the input box to filter data in real time.
↳ Comprehensive Appearance Customization: Configure the input box with placeholder text, font, color, and transparency.
↳ Enhanced Button Controls: Adjust Apply button settings for color, transparency, borders, and padding. Customize the Dismiss button for clearing filters.
↳ Focus Accent Bar & Borders: Highlight the active input field with an accent bar. Set borders around the input area.

Excited about these new features? I for sure am 🚀

* Note 1: Some features might still be in development.
* Note 2: All images used are from Microsoft, as unfortunately I don't have the latest version of Power BI on my laptop yet.
* Note 3: Links to all the update details in the comments!
#data #datapears #powerbi #report #reporting #dataviz #datavisualization #news