London Area, United Kingdom
5K followers 500+ connections

About

I am a Data Platform Architect and Engineering…

Experience & Education

  • XR Extreme Reach

Projects

  • Apache Falcon (Committer)

    Falcon is a feed processing and feed management system that makes it easier for end consumers to onboard their feed processing and feed management onto Hadoop clusters.

  • Apache Hive (Committer)

  • Apache Lens (Contributor)

    Lens provides a Unified Analytics interface. Lens aims to cut the Data Analytics silos by providing a single view of data across multiple-tiered data stores and an optimal execution environment for the analytical query. It seamlessly integrates Hadoop with traditional data warehouses to appear like one.

    At a high level, the project provides these features -

    Simple metadata layer which provides an abstract view over tiered data stores
    Single shared schema server based on the Hive Metastore - This schema is shared by data pipelines (HCatalog) and analytics applications.
    OLAP Cube QL, a high-level SQL-like language to query and describe data sets organized in data cubes.
    A JDBC driver and Java client libraries to issue queries, and a CLI for ad hoc queries.
    Lens application server - a REST server which allows users to query data, make schema changes, schedule queries and enforce quota limits on queries.
    A driver-based architecture that allows plugging in reporting systems such as Hive, columnar data warehouses, Redshift, etc.
    Cost-based engine selection - allows optimal use of resources by selecting the best execution engine for a given query based on the query cost.
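
The cost-based selection idea in the last item can be illustrated with a minimal sketch. This is not Lens's actual API: `select_engine`, the engine names, and the cost estimator callables are all hypothetical, and the sketch assumes each engine can return either a numeric cost estimate for a query or `None` when it cannot run the query.

```python
def select_engine(query, engines):
    """Pick the engine with the lowest estimated cost for a query.

    engines is a hypothetical mapping of engine name -> estimator, where
    each estimator takes the query and returns a numeric cost, or None
    if that engine cannot execute the query.
    """
    costs = {}
    for name, estimate in engines.items():
        cost = estimate(query)
        if cost is not None:  # skip engines that cannot run the query
            costs[name] = cost
    if not costs:
        raise ValueError("no engine can run this query")
    # The cheapest estimate wins.
    return min(costs, key=costs.get)
```

For example, with estimators that return 120.0 for one engine and 15.0 for another, the second engine is selected.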

  • WebCrawler-Pagerank

    Given a root URL and a depth limit, the crawler starts by fetching the page at the root URL and processing it. Processing a page consists of two tasks:
    Calculating the page rank (more on that in the next section).
    Crawling (fetching and processing) its statically linked pages until the depth limit is reached.
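
The crawling task described above can be sketched as a depth-limited breadth-first traversal. This is a minimal illustration, not the project's actual code: `crawl`, `LinkExtractor`, and the injected `fetch` callable (URL in, HTML text out) are hypothetical names, the root page is assumed to sit at depth 0, and the page rank calculation is left out because the profile does not describe it.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags (the statically linked pages)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(root_url, depth_limit, fetch):
    """Crawl breadth-first from root_url, fetching pages up to depth_limit.

    fetch is a hypothetical callable mapping a URL to its HTML text.
    Returns a dict mapping each fetched URL to the URLs it links to.
    """
    graph = {}
    queue = deque([(root_url, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in graph:
            continue  # already fetched and processed
        parser = LinkExtractor()
        parser.feed(fetch(url))
        # Resolve relative links against the page they appear on.
        links = [urljoin(url, href) for href in parser.links]
        graph[url] = links
        if depth < depth_limit:
            queue.extend((link, depth + 1) for link in links)
    return graph
```

Passing `fetch` in as a parameter keeps the sketch independent of any HTTP library; the resulting link graph is the kind of input a page rank calculation would typically consume.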

Languages

  • English

    Full professional proficiency

  • Hindi

    Full professional proficiency
