DataStax Help Center

Accessing files on CFS results in "Remote CFS sblock not found"

Summary

This article discusses the "Remote CFS sblock not found" error that can occur when accessing files on the Cassandra File System (CFS).

Symptoms

Some files on CFS are not accessible. For example, the dse hadoop fs -copyToLocal command generates errors such as:

ERROR [Thrift:13305] 2016-03-09 15:22:51,765 DseServer.java:366 - Remote CFS sblock not found: java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]:org.apache.cassandra.db.composites.SimpleDenseCellName@cf68ee9c 
org.apache.cassandra.thrift.NotFoundException: null 
at com.datastax.bdp.server.DseServer.validateAndGetCell(DseServer.java:452) [dse-core-4.8.4.jar:4.8.4] 
at com.datastax.bdp.server.DseServer.getRemoteSubBlockCell(DseServer.java:424) [dse-core-4.8.4.jar:4.8.4] 
at com.datastax.bdp.server.DseServer.getRemoteSubBlockCell(DseServer.java:392) [dse-core-4.8.4.jar:4.8.4] 
at com.datastax.bdp.server.DseServer.getRemoteSubBlock(DseServer.java:335) [dse-core-4.8.4.jar:4.8.4] 
at com.datastax.bdp.server.DseServer.get_remote_cfs_sblock(DseServer.java:298) [dse-core-4.8.4.jar:4.8.4] 
at org.apache.cassandra.thrift.Dse$Processor$get_remote_cfs_sblock.getResult(Dse.java:1322) [dse-core-4.8.4.jar:4.8.4] 
at org.apache.cassandra.thrift.Dse$Processor$get_remote_cfs_sblock.getResult(Dse.java:1306) [dse-core-4.8.4.jar:4.8.4] 
...
ERROR [Thrift:13305] 2016-03-09 15:22:51,765 ProcessFunction.java:41 - Internal error processing get_remote_cfs_sblock 
org.apache.thrift.TApplicationException: Remote CFS sblock not found: java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]:org.apache.cassandra.db.composites.SimpleDenseCellName@cf68ee9c 
at com.datastax.bdp.server.DseServer.getRemoteSubBlock(DseServer.java:367) ~[dse-core-4.8.4.jar:4.8.4] 
at com.datastax.bdp.server.DseServer.get_remote_cfs_sblock(DseServer.java:298) ~[dse-core-4.8.4.jar:4.8.4] 
at org.apache.cassandra.thrift.Dse$Processor$get_remote_cfs_sblock.getResult(Dse.java:1322) ~[dse-core-4.8.4.jar:4.8.4] 
at org.apache.cassandra.thrift.Dse$Processor$get_remote_cfs_sblock.getResult(Dse.java:1306) ~[dse-core-4.8.4.jar:4.8.4]
...

Cause

This issue occurs when one or more subblocks of a CFS file are not available on the replica nodes, typically because the replicas are out of sync.

For a file that is consistent on all replicas, the dsetool checkcfs command returns output like the following:

$ bin/dsetool checkcfs /myCFSdir/spark-10-day-loss.jar
Path: cfs://10.1.2.3/myCFSdir/spark-10-day-loss.jar
  INode header:
    File type: FILE
    User: automaton
    Group: automaton
    Permissions: rwxrwxrwx (777)
    Block size: 67108864
    Compressed: true
    First save: true
    Modification time: Sat Apr 02 03:44:36 UTC 2016
  INode:
    Block count: 1
    Blocks:                               subblocks     length         start           end
      (B) 3865c280-f885-11e5-bd7e-8bf6f52da1d7:   1      49803             0         49803
          386637b0-f885-11e5-bd7e-8bf6f52da1d7:          49803             0         49803
  Block locations:
    3865c280-f885-11e5-bd7e-8bf6f52da1d7: [10.1.2.3, 10.1.2.4, 10.1.2.5]
  Data:
    All data blocks ok.

Here is example output for a file with unreadable subblocks:

$ bin/dsetool checkcfs /user/datastax/myLargeFile.txt
Path: cfs://10.4.5.6/user/datastax/myLargeFile.txt
  INode header:
    File type: FILE
    User: datastax
    Group: datastax
    Permissions: rwxrwxrwx (777)
    Block size: 67108864
    Compressed: true
    First save: true
    Modification time: Sat Apr 02 03:56:12 UTC 2016
  INode:
    Block count: 320
    Blocks:                               subblocks     length         start           end
      (B) f0d40d80-e5ad-11e5-ad08-139e6dddc791:  32   67108864             0      67108864
          f3503110-e5ad-11e5-ad08-139e6dddc791:        2097152             0       2097152
          f40f6260-e5ad-11e5-ad08-139e6dddc791:        2097152       2097152       4194304
...
  Data:
    Error: Failed to read subblock: 45e5fd90-e5b5-11e5-ad08-139e6dddc791 (cause: org.apache.thrift.TApplicationException: Remote CFS sblock not found: java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]:org.apache.cassandra.db.composites.SimpleDenseCellName@cf68ee9c)
    Error: Failed to read subblock: 468ec0b0-e5b5-11e5-ad08-139e6dddc791 (cause: org.apache.thrift.TApplicationException: Remote CFS sblock not found: java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]:org.apache.cassandra.db.composites.SimpleDenseCellName@ef774e38)
    Error: Failed to read subblock: 473fe840-e5b5-11e5-ad08-139e6dddc791 (cause: org.apache.thrift.TApplicationException: Remote CFS sblock not found: java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]:org.apache.cassandra.db.composites.SimpleDenseCellName@7bdbfe3d)
    Error: Failed to read subblock: 7a6a8d00-e5b6-11e5-ad08-139e6dddc791 (cause: org.apache.thrift.TApplicationException: Remote CFS sblock not found: java.nio.HeapByteBuffer[pos=0 lim=32 cap=32]:org.apache.cassandra.db.composites.SimpleDenseCellName@f88d9ac0)
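One way to triage this output is to extract the IDs of the unreadable subblocks. The sketch below runs against a captured sample of the errors shown above; in practice you would pipe live dsetool checkcfs output into the same sed filter:

```shell
#!/bin/sh
# Pull the failing subblock UUIDs out of checkcfs output.
# A captured sample stands in here for live `dsetool checkcfs <path>` output.
sample='    Error: Failed to read subblock: 45e5fd90-e5b5-11e5-ad08-139e6dddc791 (cause: org.apache.thrift.TApplicationException: Remote CFS sblock not found)
    Error: Failed to read subblock: 468ec0b0-e5b5-11e5-ad08-139e6dddc791 (cause: org.apache.thrift.TApplicationException: Remote CFS sblock not found)'

# Keep only lines that report a failed subblock read and print the UUID.
echo "$sample" | sed -n 's/.*Failed to read subblock: \([0-9a-f-]*\) .*/\1/p'
```

A count of these UUIDs (append | wc -l) gives a quick measure of how badly a file is affected.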

Solution

CFS subblocks are stored just like any other Cassandra data, in the cfs keyspace. Run a repair on that keyspace to bring the replicas back in sync.

Step 1 - On the first replica node, run the following repair command:

$ nodetool repair -pr -- cfs

Step 2 - Repeat the step above on each remaining node, one node at a time, until all nodes in the data centre have been repaired.

Step 3 - Repeat the steps above on the next data centre until all data centres have been repaired.
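Steps 1 to 3 amount to a rolling, primary-range repair of the cfs keyspace across the cluster. A minimal sketch of that loop is below; the node list and ssh access are assumptions for illustration, and the sketch only prints each command rather than executing it:

```shell
#!/bin/sh
# Hypothetical node list for one data centre; replace with your own nodes.
NODES="10.1.2.3 10.1.2.4 10.1.2.5"

for node in $NODES; do
  # -pr repairs only the node's primary token ranges, so running the
  # command once per node covers every range in the ring exactly once.
  # The sketch prints each command; drop the leading "echo" to run it.
  echo ssh "$node" nodetool repair -pr -- cfs
done
```

Wait for each repair to finish before starting the next node, so that only one node is repairing at a time.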

See also

DataStax Blog - Repair in Cassandra

DataStax Support KB - How to store files on CFS and get the correct path

DataStax doc - nodetool repair
