Apache Spark Connector

Apache Spark Connector

The high-performance Apache Spark Connector provides read/write capabilities within your application, allowing you to perform numerous Apache Spark operations with minimal coding or even no-coding at all. Integrate Apache Spark data into applications such as SSIS, SQL Server, any ODBC-compatible application, or even directly within a programming language using this connector.

Download

Integrate Apache Spark with these applications

All
Data Integration
Database
BI & Reporting
Productivity
Programming Languages
Automation & Scripting
ODBC applications

SQL examples for Apache Spark Connector

Use these example Apache Spark SQL queries within SSIS, SQL Server or any ODBC-compatible application:

Select specific employees

Selects specific columns for employees in the Sales department. A basic filtering query.

-- Basic SELECT with a WHERE clause
SELECT
    id,
    name,
    salary
FROM employees
WHERE department = 'Sales';

Rank employees by salary (Window Function)

Uses the ROW_NUMBER() window function to assign a rank to employees based on salary (highest first) within each department.

-- Using a window function for ranking
SELECT
    id,
    name,
    department,
    salary,
    ROW_NUMBER() OVER (PARTITION BY department ORDER BY salary DESC) as salary_rank_in_dept
FROM employees;

Extract data from JSON string

Uses the built-in get_json_object function to extract the 'city' field from a JSON string stored in the 'user_data' column.

-- Assumes a table 'user_activity' with a STRING column 'user_data' containing JSON like:
-- {"name": "Alice", "address": {"street": "123 Main", "city": "Anytown"}}
SELECT
    user_id,
    get_json_object(user_data, '$.address.city') AS city
FROM user_activity
WHERE user_data IS NOT NULL; -- Example filter

Explode array elements (lateral view)

Uses LATERAL VIEW explode() to transform array elements into separate rows, finding users interested in 'hiking'. Assumes 'user_profiles' table has 'interests' ARRAY<STRING> column.

-- Using LATERAL VIEW to query array elements
SELECT DISTINCT
    up.user_id,
    up.name
FROM user_profiles up
LATERAL VIEW explode(interests) interest_table AS single_interest
WHERE single_interest = 'hiking';

Extract browser from User Agent (Regex)

Uses the regexp_extract function to parse a browser name (Firefox, Chrome, Safari, or MSIE) from a 'user_agent' string column in 'web_logs'.

-- Using a regular expression function
SELECT
    user_agent,
    regexp_extract(user_agent, '(Firefox|Chrome|Safari|MSIE)', 1) AS browser
FROM web_logs
LIMIT 100; -- Example limit