Palo Alto, California, United States
52K followers 500+ connections

Experience & Education

  • Oscilar Inc

Publications

  • Building a Replicated Logging System with Apache Kafka

    Very Large Data Base Endowment Inc. (VLDB Endowment)

    Apache Kafka is a scalable publish-subscribe messaging system
    with its core architecture as a distributed commit log.
    It was originally built at LinkedIn as its centralized event
    pipelining platform for online data integration tasks. Over
    the past years developing and operating Kafka, we extend
    its log-structured architecture as a replicated logging backbone
    for much wider application scopes in the distributed
    environment. In this abstract, we will talk about our design
    and engineering experience to replicate Kafka logs for various
    distributed data-driven systems at LinkedIn, including
    source-of-truth data storage and stream processing.

  • Building LinkedIn’s Real-time Activity Data Pipeline

    Bulletin of the IEEE Computer Society Technical Committee on Data Engineering

  • Kafka: A Distributed Messaging System for Log Processing

    NetDB 2011

    Log processing has become a critical component of the data pipeline for consumer internet companies. We introduce Kafka, a distributed messaging system that we developed for collecting and delivering high volumes of log data with low latency. Our system incorporates ideas from existing log aggregators and messaging systems, and is suitable for both offline and online message consumption. We made quite a few unconventional yet practical design choices in Kafka to make our system efficient and scalable. Our experimental results show that Kafka has superior performance when compared to two popular messaging systems. We have been using Kafka in production for some time and it is processing hundreds of gigabytes of new data each day.

