0 votes
0 answers
83 views

I am running an Apache Spark job on Amazon EMR that needs to connect to an Amazon MSK cluster configured with IAM authentication. The EMR cluster has an IAM role with full MSK permissions, and I can ...
Vishwas Singh's user avatar
0 votes
0 answers
52 views

I have been using the Spark v3.5 streaming functionality for the use case below. I am observing the issue below with Spark streaming in one of the environments. I would appreciate some assistance with ...
Saurabh Agrawal's user avatar
0 votes
1 answer
143 views

I am trying to load the data written into the Kafka topic into the Postgres table. I can see the topic is receiving new messages every second and also the data looks good. However, when I use the ...
RushHour's user avatar
  • 645
0 votes
1 answer
125 views

ERROR SparkContext: Failed to add home/areaapache/software/spark-3.5.2-bin-hadoop3/jars/spark-streaming-kafka-0-10_2.13-3.5.2.jar \ to Spark environment import logging from pyspark.sql import ...
Lê Anh Tuấn 291N40's user avatar
3 votes
0 answers
83 views

I have a Spark Streaming application, running on Argo + K8S, that reads Kafka topics by subscribe pattern, applies some transformations, and writes to a target. Several different producers may write ...
Александр Трутнев's user avatar
0 votes
1 answer
310 views

My goal is to run a Spark job using Databricks, and my challenge is that I can't store files in the local filesystem since the file is saved in the driver, but when my executors tried to access the ...
John Doe's user avatar
  • 433
1 vote
0 answers
400 views

I am working on Spark Streaming and reading data from a Kafka topic, but I am getting the error java.lang.NoClassDefFoundError: org/apache/spark/kafka010/KafkaConfigUpdater. Running my code in K8s and provide ...
vivekdesai's user avatar
2 votes
0 answers
132 views

I have multiple topics in Kafka that I need to sink into their respective Delta tables. A) 1 streaming query for all topics: If I use one streaming query, then the RDD/DF should contain data from ...
MaatDeamon's user avatar
  • 9,869
2 votes
1 answer
3k views

When I try to run this .py: import logging from cassandra.cluster import Cluster from pyspark.sql import SparkSession from pyspark.sql.functions import from_json, col from pyspark.sql.types import ...
francollado99's user avatar
2 votes
1 answer
558 views

I am in a bind here. I am trying to implement a very basic pipeline which reads data from Kafka and processes it in Spark. The problem I am facing is that Apache Spark shuts down abruptly giving the ...
Nanomachines Son's user avatar
1 vote
0 answers
99 views

I'm trying to read a stream from Kafka using PySpark. The stack I'm working with: Kubernetes. Standalone Spark cluster with 2 workers. spark-connect connected to the cluster and has the dependencies ...
waseemoo1's user avatar
0 votes
1 answer
171 views

I can't write to Kafka from Spark. Spark is reading but not writing; if I write to the console, it doesn't give an error. Traceback (most recent call last): File "f:\Sistema de Informação\TCC\...
Ingrid Iplinsky's user avatar
0 votes
1 answer
78 views

I have been trying to complete a project in which I need to send a data stream using Kafka to a local Spark instance to process the incoming data. However, I cannot show and use the DataFrame in the right ...
AFORS's user avatar
  • 11
0 votes
0 answers
33 views

Hello, I am trying to use PySpark + Kafka. To do this, I execute this command to set up the Spark application. Spark version is 3.5.0 | spark-3.5.0-bin-hadoop3. Kafka version is - ...
Nícolas Farfán Cheneaux's user avatar
0 votes
0 answers
273 views

I'm trying to read data from a Kafka topic using Spark Structured Streaming on an EC2 (Ubuntu) machine. If I try to read the data using only the Kafka console consumer (kafka-console-consumer.sh), then there is no ...
Rushi's user avatar
  • 25
