1st International Workshop on Benchmarking of XML and Semantic Web Applications
(BenchmarX'09) - April 20, 2009
Accepted Papers
Radim Baca and Michal Kratky: TJDewey - On the Efficient Path Labeling Scheme Holistic Approach
In recent years, many approaches to XML twig pattern searching have been developed. Holistic approaches are particularly significant in that they provide a theoretical model for optimal processing of some query classes and have very low main memory complexity. Holistic algorithms can be incorporated into XQuery algebra as a twig query pattern operator.
Indexing methods use two types of labeling schemes: element and path labeling schemes. A path labeling scheme is one in which the labels of all ancestors can be extracted from a node label. The authors of the TJFast method introduced an application of a path labeling scheme (Extended Dewey) to holistic methods. In our paper, we describe some improvements to this method that lead to better scalability of the TJFast algorithm. We introduce the TJDewey algorithm, which combines the TJFast algorithm with the DataGuide summary tree. Path labeling schemes have better update features, and our article shows that using a path labeling scheme can yield query processing parameters that are comparable to, or even better than, those of element labeling scheme approaches.
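The defining property of a path labeling scheme, that ancestor labels can be read off a node label, can be illustrated with a minimal sketch. It assumes plain Dewey labels written as dot-separated child positions (for example "1.2.3"), not the Extended Dewey encoding used by TJFast; the function names `ancestors` and `is_ancestor` are hypothetical and chosen only for this illustration.

```python
def ancestors(label):
    """Return the labels of all proper ancestors of a Dewey label, root first."""
    parts = label.split(".")
    # Every proper prefix of the component list is the label of an ancestor.
    return [".".join(parts[:i]) for i in range(1, len(parts))]


def is_ancestor(a, b):
    """Return True if the node labelled a is a proper ancestor of the node labelled b."""
    pa, pb = a.split("."), b.split(".")
    # Compare whole components, so "1.2" is not mistaken for a prefix of "1.20.3".
    return len(pa) < len(pb) and pb[:len(pa)] == pa


print(ancestors("1.2.3"))           # ['1', '1.2']
print(is_ancestor("1.2", "1.2.3"))  # True
```

Because the ancestor test needs only the two labels, structural relationships can be checked without touching the document tree, which is the feature that holistic methods based on path labels rely on.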
Suren Chilingaryan: The XMLBench Project: Comparison of fast, multi-platform XML libraries
XML technologies have brought many new ideas and capabilities to the field of information management systems. Nowadays, XML is used almost everywhere: from small configuration files to multi-gigabyte archives of measurements. Many network services use XML as a transport protocol. XML-based applications use multiple XML technologies to simplify software development: DOM is used to create and navigate XML documents, XSD is used to check consistency and validity, XSL simplifies transformation between different formats, and XML Encryption and Signature establish a secure and trustworthy way of exchanging and storing information. These technologies are provided by multiple commercial and open source libraries, which vary significantly in features and performance. Moreover, some libraries are optimized for certain tasks and, therefore, actual library performance can vary significantly depending on the type of data processed.
The XMLBench project was started to provide a comprehensive comparison of available XML toolkits in terms of their functionality and their ability to sustain the required performance. The main targets were fast C and C++ libraries able to work on multiple platforms. The applied tests compare different aspects of XML processing and are run on a few auto-generated data sets emulating library usage for different tasks. The details of the test setup and the results achieved will be presented.
Curtis Dyreson and Hao Jin: A Synthetic, Trend-Based Benchmark for XPath
Interest in querying XML is increasing as it becomes an important medium for data representation and exchange. A core component of most XML query languages is XPath. This paper describes a benchmark for comparing the performance of XPath query evaluation engines. The benchmark consists of an XML document generator that produces synthetic XML documents using a variety of benchmark-specific control factors. The benchmark also has a set of queries to compare XPath evaluation for each control factor. This paper reports on the performance of several popular XPath query engines using the benchmark and draws some general inferences from the results.
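The idea of a generator driven by control factors can be sketched in a few lines. This is only an illustration, not the benchmark's actual generator: the two control factors shown (`depth` and `fanout`), the element name `n`, and the function names are all assumptions made for this example, and the abstract does not list the factors the benchmark uses.

```python
import xml.etree.ElementTree as ET


def generate(depth, fanout, tag="n"):
    """Build a complete tree with `depth` levels and `fanout` children per inner node."""
    root = ET.Element(tag)
    if depth > 1:
        for _ in range(fanout):
            root.append(generate(depth - 1, fanout, tag))
    return root


def count_nodes(elem):
    """Count the element itself plus all of its descendants."""
    return sum(1 for _ in elem.iter())


doc = generate(depth=3, fanout=2)
print(count_nodes(doc))           # 7 elements: 1 + 2 + 4
print(len(doc.findall(".//n")))   # 6 descendants matched by the path expression
```

Varying one factor while holding the others fixed, and timing the same path expression on each generated document, is one way to observe how an engine's evaluation cost trends with that factor.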
Sherif Sakr: An Empirical Evaluation of XML Compression Tools
This paper presents an extensive experimental study of the state of the art in XML compression tools. The study reports the behavior of nine XML compressors on a large corpus of XML documents that covers a range of document types and sizes. In addition to assessing and comparing the performance characteristics of the evaluated XML compression tools, the study tries to assess the effectiveness and practicality of using these tools in the real world. Finally, we provide some guidelines and recommendations to help developers and users select the most suitable XML compression tool for their needs.
Karsten Schmidt, Sebastian Bachle and Theo Harder: Benchmarking Performance-Critical Components in a Native XML Database System
The rapidly increasing number of XML-related applications indicates a growing need for efficient, dynamic, and native XML support in database management systems (XDBMS). So far, both industry and academia primarily focus on benchmarking of high-level performance figures for a variety of applications, queries, or documents - frequently executed in artificial workload scenarios - and, therefore, may analyze and compare only specific or incidental behavior of the underlying systems. To cover the full XDBMS support, it is mandatory to benchmark performance-critical components bottom-up, thereby removing bottlenecks and optimizing component behavior. In this way, wrong conclusions are avoided when new techniques such as tailored XML operators, index types, or storage mappings with unfamiliar performance characteristics are used. As an experience report, we present what we have learned from benchmarking a native XDBMS and recommend certain setups to do it in a systematic and meaningful way.
Pavel Strnad and Michal Valenta: On Transaction Manager's Benchmarking
We describe an approach to measuring the performance of a transaction manager. We design a very simple benchmark intended for evaluating this important component of a DB engine. Then we apply it to our own transaction manager implementation. We also describe the implementation of the transaction manager itself. It is built as a software layer over the eXist database engine. It is a standalone module that can be used to extend eXist with transactional processing when needed.