Questions tagged [Apache Spark] (6)

2
answers

What projects should I build for a Java portfolio?

I have been learning Java for 1.5 years. I am familiar with Java core, well versed in Spring and Hibernate, and know a little Spark. What projects should I build for a portfolio? As it stands, I study technologies and apply them separately from each other, and I see little sense in that. UPD: Preferably with the use of a database.
armani asked April 1st 20 at 18:06
0
answers

How to add data to an existing CSV file?

Hello, comrades! Please help me figure this out. There is a CSV file with data on an SFTP server. The file has 4 columns. Is it possible, using Scala (Spark), to add extra data at the beginning of this CSV file? For example: | REQUEST_DATE | 2019-02-05 20:00:00 | | USER | Kate | | SEARCH_TYPE | Glo...
harmon_Runolfss asked March 20th 20 at 11:07
0
answers

Hadoop + Apache Spark: how to deploy a cluster?

Hi, I'm just getting into architecture. I need to deploy Hadoop for data storage and Apache Spark for processing. What will the architecture look like in terms of servers? Do I need many servers assembled into a cluster? Or a single cloud where everything runs? What is the right approach?
Douglas_OReilly asked March 19th 20 at 08:40
0
answers

Spark Master as Driver in Kubernetes Cluster?

Hello, please tell me, has anyone tried launching Spark in Kubernetes? So far I have only googled for solutions that might help.. In Kubernetes, I want the Spark Master to act as the Driver, from which spark-submit would be run and which would accordingly monitor the job, so that it is immediately visible in the Web UI and then in the History S...
Karianne asked March 17th 20 at 11:40
0
answers

Working with a DataFrame in Spark?

Hello, comrades! Please help me figure this out. Suppose a DataFrame stores the time of a user's first contact with the call center. +---------+-------------------+ |USER_NAME| REQUEST_DATE| +---------+-------------------+ | Mark|2018-02-20 00:00:00| | Alex|2018-03-01 00:00:00| | Bob|2018-03-01 00:00:00| | Mark|2018-07-01 00:...
Leola_Sporer asked March 16th 20 at 13:43
0
answers

How to merge two related DataFrames in Spark?

Hello, colleagues! I'm studying the Spark framework and can't figure out how to merge two DataFrames. The first DataFrame stores data about user requests to the service centre. Let's call this DataFrame df1. +-----------+---------------------+ | USER_NAME | REQUEST_DATE | +-----------+---------------------+ | Alex ...
Howell_Murazi asked March 16th 20 at 13:42
0
answers

How to read multiple parquet files in Spark?

Hello, comrades! Please help me figure something out with Spark. There is a directory containing a bunch of parquet files. The file names share a common format: "DD-MM-YYYY". For example: '01-10-2018', '02-10-2018', '03-10-2018', etc. As input parameters, I receive a start date (dateFrom) and an end date (dateTo). The v...
Pinkie.Bartolet asked March 15th 20 at 23:05
2
answers

How to filter data for a certain period in Spark?

Hello, comrades! Please help me figure this out. I haven't worked with Spark before and am trying to get to grips with it using a simple example. Suppose there is a large file with the following structure (see below). It stores the date, the mobile number, and its status at that time. | CREATE_DATE | MOBILE_KEY | STATUS | |---------------------|------------|--...
Jany.Reynolds90 asked March 15th 20 at 22:42
0
answers

How to connect to MS SQL Server from Spark (Apache Spark) in Java?

Hello! Could you tell me what the problem is? There is Java code that creates a Spark session, configures it, and tries to access an MS SQL Server database from Spark: SparkConf conf = new SparkConf().setMaster("local[4]").setAppName("Word Count"); conf.set("spark.driver.extraClassPath","C:\\temp\\sqljdbc42.jar");//this path ...
hunter_Kunze73 asked June 21st 19 at 12:44
1
answer

How to monitor Spark?

Hello all! Could anyone who knows tell me how to monitor Spark without using third-party packages/engines?
margarette_Wiso asked June 19th 19 at 20:42