Tuesday, October 24, 2017

VNC server won't start on my CentOS 7

I had to ask my boss to power down and power up the desktop in my office this morning. After the desktop came back, I could not access the vncserver because it failed to start. And I got this error when I run systemctl.
$ sudo systemctl start vncserver@\:2

Job for vncserver@:2.service failed because a configured resource limit was exceeded. See "systemctl status vncserver@:2.service" and "journalctl -xe" for details.
And I also found a Xvnc process started on :1 every time when I start the service.
I did find that `/usr/lib/systemd/system/vncserver@:2.service was missing and restored it like below:
[Unit]
Description=Remote desktop service (VNC)
After=syslog.target network.target

[Service]
Type=forking
User=bwang

# Clean any existing files in /tmp/.X11-unix environment
ExecStartPre=-/usr/bin/vncserver -kill %i
ExecStart=/usr/bin/vncserver %i -nolisten tcp -localhost -geometry 1920x1080 -de
pth 24 
PIDFile=/home/bwang/.vnc/%H%i.pid
ExecStop=-/usr/bin/vncserver -kill %i

[Install]

WantedBy=multi-user.target
And I ran systemctl daemon-reload, systemctl disable vncserver@\:2.service, systemctl enable vncserver@\:2.service multiple times, but got the same error again and again.
I also tried using vncserver\@\:2 or vncserver@:2 for service name, it won’t fix it.
Finally when I ran vncserver from the command line, it showed this message
$ vncserver :2 -nolisten tcp -localhost -geometry 1920x1080 -depth 24

Warning: bwang.corp.rhapsody.com:2 is taken because of /tmp/.X11-unix/X2
Remove this file if there is no X server bwang.corp.rhapsody.com:2
A VNC server is already running as :2

New 'bwang.corp.rhapsody.com:1 (bwang)' desktop is bwang.corp.rhapsody.com:1

Starting applications specified in /home/bwang/.vnc/xstartup
Log file is /home/bwang/.vnc/bwang.corp.rhapsody.com:1.log
Obviously, when the deskotp was powered down, /tmp/.X11-unix/X2 was left on the hard drive, and it blocked vncserver from starting on display 2 again.
Removing /tmp/.X11-unix/X2 fix the problem.
NOTE: using vncserver@:2 without escaping : is ok when running systemctl.

Wednesday, October 4, 2017

Spark SQL AnalysisException due to data type mismatch

If you run the following code, you will encounter AnalysisException with data type mismatch.
spark.sql("create table test_table(id int, data struct<name: string, age:int>)")
spark.sql("insert overwrite table test_table select 1, struct('joe', 15)")
val d = spark.read.table("test_table")

case class UserData(name: String, age: Int)
case class User(id: Int, data: UserData)
val u = Seq(User(1, UserData("joe", 15)), User(2, UserData("mary", 10))).toDF

val missing = {
  u.join(
    d,
    u("id") === d("id")
    && u("data")("name") === d("data")("name")
    && u("data")("age") === d("data")("age"),
    "outer"
  )
  .where( u("id").isNull || d("id").isNull)
}

// show this result
// +---+---------+----+----+                                                      
// |id |data     |id  |data|
// +---+---------+----+----+
// |2  |[mary,10]|null|null|
// +---+---------+----+----+
missing.show(false)

// Throws this error: org.apache.spark.sql.AnalysisException: cannot resolve '(`data` = test_table.`data`)'
// due to data type mismatch: differing types in '(`data` = test_table.`data`)'
// (struct<name:string,age:int> and struct<name:string,age:int>).;
val missing_2 = {
  u.join(d,
         u("id") === d("id") && u("data") === d("data"),
         "outer")
    .where(u("id").isNull || d("id").isNull)
}
Don’t be fooled by (struct<name:string,age:int> and struct<name:string,age:int>). The problem is caused by nullable, which is not shown in the error message. The DataFrame created from case classes has nullable=false for id and age because Scala Int cannot be null, while the SQL creates nullable fields. And if you compare a field with complex type (struct, array), Spark just thinks they are different as shown in missing_2. But if you compare field by field, there is no problem as shown in missing.
scala> u.schema.treeString
res4: String =
"root
 |-- id: integer (nullable = false)
 |-- data: struct (nullable = true)
 |    |-- name: string (nullable = true)
 |    |-- age: integer (nullable = false)
"

scala> d.schema.treeString
res5: String =
"root
 |-- id: integer (nullable = true)
 |-- data: struct (nullable = true)
 |    |-- name: string (nullable = true)
 |    |-- age: integer (nullable = true)
"