spark redis key column mapping not working - null returned
I am following the example from the spark-redis connector documentation to create DataFrames from an existing hash.
The hash is built as follows:
127.0.0.1:6379> hset person:1 name John age 30
(integer) 2
127.0.0.1:6379> hset person:2 name Peter age 45
(integer) 2
127.0.0.1:6379> hset person:3 name James age 40
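The code to read the hash is as follows:
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Local session pointed at the Redis instance that holds the person:* hashes
SparkSession spark = SparkSession
    .builder()
    .appName("MyApp")
    .master("local[*]")
    .config("spark.redis.host", "localhost")
    .config("spark.redis.port", "6379")
    .getOrCreate();
// Explicit schema: "id" should be filled from the Redis key, the rest from hash fields
Dataset<Row> df = spark.read()
    .format("org.apache.spark.sql.redis")
    .schema(new StructType(new StructField[] {
        DataTypes.createStructField("id", DataTypes.StringType, true),
        DataTypes.createStructField("name", DataTypes.StringType, false),
        DataTypes.createStructField("age", DataTypes.IntegerType, false)
    }))
    .option("keys.pattern", "person:*")
    .option("key.column", "id")
    .load();
df.show();
df.printSchema();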
Output:
+----+-----+---+
|  id| name|age|
+----+-----+---+
|null| John| 30|
|null|James| 40|
|null|Peter| 45|
+----+-----+---+
root
 |-- id: string (nullable = true)
 |-- name: string (nullable = false)
 |-- age: integer (nullable = false)
I was expecting 1, 2, 3 respectively in the id column, but I get null instead. Any pointers would help. Also, this code is in Java, so I am not sure whether there is an issue with data types.
apache-spark dataframe redis key
Can you check whether the data type for id should be Integer in the schema: DataTypes.createStructField("id", DataTypes.IntegerType, true) instead of StringType?
– Pavithran Ramachandran
Nov 11 at 5:12
When I use IntegerType I still get null. When I set the nullable parameter to false, I get an NPE for StringType and 0 for IntegerType.
– Jerry Pereira
Nov 11 at 6:20
asked Nov 10 at 22:51
Jerry Pereira
1 Answer
accepted
The released version you're using doesn't support the key.column mapping. Use the tip of the master branch until a new version is released.
xref: https://github.com/RedisLabs/spark-redis/issues/114
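For reference, the mapping the question expects (ids 1, 2, 3 from keys person:1, person:2, person:3) can be sketched in plain Java. This is only an illustration, assuming the key column is meant to hold the part of the Redis key that follows the literal prefix of keys.pattern. The class KeyColumnSketch and the method idFromKey are hypothetical names and are not part of spark-redis.

```java
public class KeyColumnSketch {

    // Hypothetical helper, not a spark-redis API: strips the literal prefix of
    // the pattern (everything before the trailing '*') from a Redis key.
    // Assumption: the pattern has the simple form "<prefix>*".
    public static String idFromKey(String keysPattern, String redisKey) {
        if (!keysPattern.endsWith("*")) {
            throw new IllegalArgumentException(
                "expected a pattern ending in '*': " + keysPattern);
        }
        String prefix = keysPattern.substring(0, keysPattern.length() - 1);
        if (!redisKey.startsWith(prefix)) {
            throw new IllegalArgumentException(
                "key does not match pattern prefix: " + redisKey);
        }
        return redisKey.substring(prefix.length());
    }

    public static void main(String[] args) {
        // The three keys created with HSET in the question
        for (String key : new String[] {"person:1", "person:2", "person:3"}) {
            System.out.println(key + " -> " + idFromKey("person:*", key));
        }
    }
}
```

With a connector build that supports key.column, the id column should carry these values rather than null.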
answered Nov 11 at 13:29
Itamar Haber