Modified UTF-8 byte[] to String decoding broken in multiple places
Platform code incorrectly decodes characters that are encoded using 3 bytes in Modified UTF-8, and it does so in multiple locations. Modified UTF-8 encoding appears to be correct, so no exported data on disk should be broken. The problems are limited to the decoding (import) code, and they can and will be fixed.
The discovered bugs affect:
- Transferable graph (TG) file importing, specifically the decoding of TG identity names and type names. If any of a model's entities contains characters that are encoded using 3 bytes in Modified UTF-8, importing a previously exported model may either fail or unexpectedly change the names of the affected resources. See https://en.wikipedia.org/wiki/UTF-8#Examples for more information on the subject. An example of such a character is http://www.fileformat.info/info/unicode/char/6f22/index.htm.
- Database indexing, which reads resource names from the database using a flawed Modified UTF-8 decoding routine.
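To illustrate which characters are affected, here is a minimal sketch that uses the JDK's `DataOutputStream.writeUTF` as a reference Modified UTF-8 encoder. It is not the platform's own encoder; the class name `ThreeByteExample` and the helper `modifiedUtf8` are made up for this example, and the 2-byte length prefix that `writeUTF` adds is stripped so that only the character bytes are counted.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class ThreeByteExample {

    /**
     * Encodes the string with the JDK's Modified UTF-8 writer and
     * returns the encoded bytes without the 2-byte length prefix.
     */
    static byte[] modifiedUtf8(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // Not expected for an in-memory stream.
            throw new UncheckedIOException(e);
        }
        byte[] all = bytes.toByteArray();
        return Arrays.copyOfRange(all, 2, all.length);
    }

    public static void main(String[] args) {
        System.out.println(modifiedUtf8("a").length);      // 1 (U+0061)
        System.out.println(modifiedUtf8("\u00e4").length); // 2 (U+00E4)
        System.out.println(modifiedUtf8("\u6f22").length); // 3 (U+6F22, the character linked above)
    }
}
```

Characters in the range U+0800 to U+FFFF take 3 bytes, so names containing CJK characters are the typical way to hit the decoding bugs.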
Broken Modified UTF-8 decoding code can be found in:
- org.simantics.graph.representation.ByteFileReader.utf
- fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.utf
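For comparison, a decoder that handles the 1-, 2- and 3-byte forms could look roughly like the sketch below. This is not the actual fix for either method listed above; the class `ModifiedUtf8Decoder` and its `decode` signature are hypothetical and assume the byte length of the string is already known. The output array is sized by the byte count, which is always enough because every decoded char consumes at least one byte.

```java
public class ModifiedUtf8Decoder {

    /** Decodes 'length' bytes of Modified UTF-8 starting at 'offset'. */
    static String decode(byte[] bytes, int offset, int length) {
        char[] chars = new char[length]; // never more chars than bytes
        int n = 0;
        int i = offset;
        int end = offset + length;
        while (i < end) {
            int b = bytes[i++] & 0xFF;
            if (b < 0x80) {
                // 1 byte: 0xxxxxxx
                chars[n++] = (char) b;
            } else if ((b & 0xE0) == 0xC0) {
                // 2 bytes: 110xxxxx 10xxxxxx
                int c2 = continuation(bytes, i++, end);
                chars[n++] = (char) (((b & 0x1F) << 6) | c2);
            } else if ((b & 0xF0) == 0xE0) {
                // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
                int c2 = continuation(bytes, i++, end);
                int c3 = continuation(bytes, i++, end);
                chars[n++] = (char) (((b & 0x0F) << 12) | (c2 << 6) | c3);
            } else {
                throw new IllegalArgumentException(
                        "Malformed Modified UTF-8 at byte " + (i - 1));
            }
        }
        return new String(chars, 0, n);
    }

    /** Returns the 6 payload bits of a continuation byte (10xxxxxx). */
    private static int continuation(byte[] bytes, int index, int end) {
        if (index >= end || (bytes[index] & 0xC0) != 0x80) {
            throw new IllegalArgumentException(
                    "Malformed Modified UTF-8 at byte " + index);
        }
        return bytes[index] & 0x3F;
    }

    public static void main(String[] args) {
        // 'A' (1 byte), U+00E4 (2 bytes), U+6F22 (3 bytes)
        byte[] data = { 0x41, (byte) 0xC3, (byte) 0xA4,
                        (byte) 0xE6, (byte) 0xBC, (byte) 0xA2 };
        System.out.println(decode(data, 0, data.length).length()); // 3
    }
}
```

Truncated or malformed sequences raise an IllegalArgumentException with the byte position, rather than surfacing as an ArrayIndexOutOfBoundsException from deep inside the decoder.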
Other UTF-related problems are/were:
- org.simantics.graph.db.StreamingTransferableGraphFileReader.forIdentities reads identity lengths assuming they are always at most 127 characters long. However, this was fixed in commit 69d8f2b1 for branch release/1.34.0 and could be backported to earlier releases as well.
- org.simantics.graph.representation.ByteFileReader.utf assumes that no Modified UTF-8 string decoded with it is longer than 128-384 characters, depending on the byte encoding of the characters (it uses a shared internal buffer that is 3*128 characters in length). Note that ByteFileReader.utf is used only for reading TG identity names and type names, so it is unlikely that those limits are reached in practice. This is nevertheless easy to fix; it is just badly optimized code.
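One possible way to remove the fixed length limit while still reusing a buffer is to grow it on demand. The sketch below is only an illustration and is not taken from ByteFileReader; the class `GrowingDecodeBuffer` and its methods are hypothetical. It relies on the fact that a Modified UTF-8 string never decodes to more chars than it has bytes.

```java
public class GrowingDecodeBuffer {

    // Starts at the size mentioned above and grows instead of overflowing.
    private char[] chars = new char[3 * 128];

    /**
     * Returns a reusable buffer with room for the decoded form of a
     * string that occupies 'byteLength' bytes in Modified UTF-8.
     */
    char[] ensureCapacity(int byteLength) {
        if (chars.length < byteLength) {
            // At least double the size to keep reallocations rare.
            chars = new char[Math.max(byteLength, chars.length * 2)];
        }
        return chars;
    }

    public static void main(String[] args) {
        GrowingDecodeBuffer buffer = new GrowingDecodeBuffer();
        System.out.println(buffer.ensureCapacity(10).length);   // 384
        System.out.println(buffer.ensureCapacity(1000).length); // 1000
    }
}
```

Short names keep using the initial buffer, so the common case stays allocation-free, and only unusually long strings trigger a reallocation.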
The database indexing crash looks like this:
2018-06-25 14:01:00,643 514945863 [ Worker-81] ERROR ics.db.common.utils.LogManager - Unexpected exception in ReadGraph.syncRequest(Read,Procedure)
org.simantics.db.exception.DatabaseException: Unexpected exception in ReadGraph.syncRequest(Read,Procedure)
at org.simantics.db.impl.graph.ReadGraphImpl.syncRequest(ReadGraphImpl.java:1995)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process.<init>(DependenciesRelation.java:158)
at org.simantics.db.layer0.genericrelation.DependenciesRelation.find(DependenciesRelation.java:196)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$1.realize(DependenciesRelation.java:219)
at org.simantics.db.indexing.IndexedRelationsSearcherBase.initializeIndexImpl(IndexedRelationsSearcherBase.java:642)
at org.simantics.db.indexing.IndexedRelationsSearcherBase.lambda$0(IndexedRelationsSearcherBase.java:596)
at org.simantics.db.indexing.internal.IndexingJob$1.run(IndexingJob.java:69)
at org.eclipse.core.internal.jobs.Worker.run(Worker.java:55)
Caused by: java.lang.ArrayIndexOutOfBoundsException: 20
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.utf(DirectQuerySupportImpl.java:1054)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.getDirectValue4(DirectQuerySupportImpl.java:936)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.getValue4(DirectQuerySupportImpl.java:774)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.getRelatedDirectValue4(DirectQuerySupportImpl.java:823)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.getRelatedValue4(DirectQuerySupportImpl.java:568)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.forPossibleRelatedValueCompiled(DirectQuerySupportImpl.java:355)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$3.execute(DependenciesRelation.java:143)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$3.execute(DependenciesRelation.java:1)
at org.simantics.db.impl.ForEachObjectContextProcedure.execute(ForEachObjectContextProcedure.java:31)
at org.simantics.db.impl.ForEachObjectContextProcedure.execute(ForEachObjectContextProcedure.java:1)
at org.simantics.db.procore.cluster.IntHash.foreachInt(IntHash.java:301)
at org.simantics.db.procore.cluster.ObjectTable.foreachObject(ObjectTable.java:179)
at org.simantics.db.procore.cluster.ClusterSmall.forObjects(ClusterSmall.java:299)
at org.simantics.db.procore.cluster.ClusterSmall.forObjects(ClusterSmall.java:486)
at fi.vtt.simantics.procore.internal.QuerySupportImpl.getObjects4(QuerySupportImpl.java:357)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.forEachObjectCompiled(DirectQuerySupportImpl.java:277)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$3.execute(DependenciesRelation.java:142)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$3.execute(DependenciesRelation.java:1)
at org.simantics.db.impl.ForEachObjectContextProcedure.execute(ForEachObjectContextProcedure.java:31)
at org.simantics.db.impl.ForEachObjectContextProcedure.execute(ForEachObjectContextProcedure.java:1)
at org.simantics.db.procore.cluster.IntHash.foreachInt(IntHash.java:301)
at org.simantics.db.procore.cluster.ObjectTable.foreachObject(ObjectTable.java:179)
at org.simantics.db.procore.cluster.ClusterSmall.forObjects(ClusterSmall.java:299)
at org.simantics.db.procore.cluster.ClusterSmall.forObjects(ClusterSmall.java:491)
at fi.vtt.simantics.procore.internal.QuerySupportImpl.getObjects4(QuerySupportImpl.java:357)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.forEachObjectCompiled(DirectQuerySupportImpl.java:277)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$3.execute(DependenciesRelation.java:142)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$3.execute(DependenciesRelation.java:1)
at org.simantics.db.impl.ForEachObjectContextProcedure.execute(ForEachObjectContextProcedure.java:31)
at org.simantics.db.impl.ForEachObjectContextProcedure.execute(ForEachObjectContextProcedure.java:1)
at org.simantics.db.procore.cluster.IntHash.foreachInt(IntHash.java:301)
at org.simantics.db.procore.cluster.ObjectTable.foreachObject(ObjectTable.java:179)
at org.simantics.db.procore.cluster.ClusterSmall.forObjects(ClusterSmall.java:299)
at org.simantics.db.procore.cluster.ClusterSmall.forObjects(ClusterSmall.java:491)
at fi.vtt.simantics.procore.internal.QuerySupportImpl.getObjects4(QuerySupportImpl.java:357)
at fi.vtt.simantics.procore.internal.DirectQuerySupportImpl.forEachObjectCompiled(DirectQuerySupportImpl.java:277)
at org.simantics.db.layer0.genericrelation.DependenciesRelation$Process$4.run(DependenciesRelation.java:162)
at org.simantics.db.common.request.ReadRequest.perform(ReadRequest.java:21)
at org.simantics.db.impl.query.QueryProcessor.tryQuery(QueryProcessor.java:5192)
at org.simantics.db.impl.graph.ReadGraphImpl.syncRequest(ReadGraphImpl.java:1986)
... 7 more
Edited by Tuukka Lehtonen