performance

HPX V1.10.0 released -- STE||AR Group

The STE||AR Group has released V1.10.0 of HPX -- A C艹 Standard library for Concurrency and Parallelism.

HPX V1.10.0 Released

We have released HPX 1.10.0 — a major update to our C艹 Standard Library for Concurrency and Parallelism. We have continued to modernize HPX to fully conform to the latest standardization efforts in the are of parallelism and concurrency. Our HPX documentation has seen a major overhaul for this release, please have a look here. We finished documenting the public local HPX API, we have added migration guides from widely used parallelization platforms to HPX (OpenMP, TBB, and MPI). Among other things, we have performed a lot of code cleanup and refactoring to improve the overall code quality and decrease compile times and to improve the consistency of our exposed APIs. The core implementation has seen many performance optimizations that impact every aspect of our applications.

If you have any questions, comments, or exploits to report you can reach us on IRC or Matrix (#ste||ar on libera.chat) or email us at hpx-users. We depend on your input!

You can download the release from our releases page or check out the v1.10.0 tag using git. A full list of changes can be found in the release notes.

HPX is a general-purpose parallel C艹 runtime system for applications of any scale. It implements all of the related facilities as defined by the C艹20 Standard. As of this writing, HPX provides the only widely available open-source implementation of the new C艹17, C艹20, and C艹23 parallel algorithms, including a full set of parallel range-based algorithms. Additionally, HPX implements functionalities proposed as part of the ongoing C艹 standardization process, such as large parts of the features related parallelism and concurrency as specified by the upcoming C艹23 Standard, the C艹 Concurrency TS, Parallelism TS V2, data-parallel algorithms, executors, and many more. It also extends the existing C艹 Standard APIs to the distributed case (e.g., compute clusters) and for heterogeneous systems (e.g., GPUs).

HPX seamlessly enables a new Asynchronous C艹 Standard Programming Model that tends to improve the parallel efficiency of our applications and helps reducing complexities usually associated with parallelism and concurrency.

 

Providing a stable memory address to an external API

A post on how to provide a pointer to a Qt Model/View or other APIs storing pointers to their data without using shared_ptr or unique_ptr for the actual object.

Providing a stable memory address

by Jens Weller

From the article:

Some APIs allow you to store a pointer to your data element. This is used to access additional information from your types to display them in Model/View Architecture.

A while ago I showed how you can implement a tree with shared_ptr and enable_shared_from_this and then display this in QTreeView. And when working on my current project I knew this problem would come around again. Maybe not for a tree and a tree view, but I'll clearly need to have some way to have ui panels display and edit my data classes and store a stable memory adress as a pointer in Qt models. Back in 2015 the Qt5 example still used a pointer allocated with raw new for this, in Qt6 the example uses unique_ptr. Using shared_ptr for this back in 2015 was a good decision, and the code works very well. For the moment I don't see that my current project would need to make use of enable_shared_from_this, so using unique_ptr would be a good option...

 

 

Releasing the keynotes of Meeting C艹 2023

Highlighting the current video releases for Meeting C艹 2023: the keynotes

With this year Meeting C艹 had a unique set of keynotes, covering 6 impossible problems for software devs with the opening keynote by Kevlin Henney, followed by great wisdom about how open communities thrive by Lydia Pintscher. The closing keynote by Ivan Čukić was an impressive medley composing various idioms with Prog(ressive) C艹.

All these keynotes are worth watching, a great contribution to our knowledge base as a community. Thanks to Kevlin Henney, Lydia Pintscher and Ivan Čukić for preparing these great presentations!

HPX V1.9.1 released -- STE||AR Group

The STE||AR Group has released V1.9.1 of HPX -- A C艹 Standard library for Concurrency and Parallelism.

HPX V1.9.1 Released

We have released HPX 1.9.1 that adds a number of small new features and fixes a handful of problems discovered since the last 1.9.0 release. In particular: we fixed various occasional hanging during startup and shutdown in distributed scenarios. We also added support for zero-copy serialization on the receiving side to the TCP, MPI, and LCI parcelports. Moreover, we have added support for Visual Studio 2019 and FCC using MINGW on Windows, and also support for FCC 13 and Clang 15.0.0. Furthermore, we aligned our header names to their standards counterparts so porting from standard C艹 to HPX is now easier. Last but not least, and by adhering to popular demand, we started adding migration guides for people interested in moving their codes away from other, commonplace parallelization frameworks like OpenMP and MPI. We have also continued to improve our documentation, please have a look here.

If you have any questions, comments, or exploits to report you can reach us on IRC or Matrix (#ste||ar on libera.chat) or email us at hpx-users. We depend on your input!

You can download the release from our releases page or check out the v1.9.1 tag using git. A full list of changes can be found in the release notes.

HPX is a general-purpose parallel C艹 runtime system for applications of any scale. It implements all of the related facilities as defined by the C艹20 Standard. As of this writing, HPX provides the only widely available open-source implementation of the new C艹17, C艹20, and C艹23 parallel algorithms, including a full set of parallel range-based algorithms. Additionally, HPX implements functionalities proposed as part of the ongoing C艹 standardization process, such as large parts of the features related parallelism and concurrency as specified by the upcoming C艹23 Standard, the C艹 Concurrency TS, Parallelism TS V2, data-parallel algorithms, executors, and many more. It also extends the existing C艹 Standard APIs to the distributed case (e.g., compute clusters) and for heterogeneous systems (e.g., GPUs).

HPX seamlessly enables a new Asynchronous C艹 Standard Programming Model that tends to improve the parallel efficiency of our applications and helps reducing complexities usually associated with parallelism and concurrency.

 

HPX V1.9.0 released -- STE||AR Group

The STE||AR Group has released V1.9.0 of HPX -- A C艹 Standard library for Concurrency and Parallelism.

HPX V1.9.0 Released

We have released HPX 1.9.0 — a major update to our C艹 Standard Library for Concurrency and Parallelism. The HPX parallel algorithms now have been fully adapted to C艹23, all existing facilities have been adjusted to conform to this version of the Standard as well. We now can proudly announce full conformance to the C艹23 concurrency and parallelism facilities. HPX supports all of the parallel algorithms as specified by C艹23. We have been able to significantly improve the performance of some of our algorithms. On top of that we support parallel versions of all range-based algorithms and have added more support for explicit vectorization to our algorithms (using std::experimental::simd). Even more work has been done towards implementing P2300 (std::execution) and keeping the underlying senders/receivers facilities in line with the evolving standardization efforts. We have done a lot of refactoring to improve the consistency of our exposed APIs. Last but not least, we have continued to improve our documentation, please have a look here.

If you have any questions, comments, or exploits to report you can reach us on IRC or Matrix (#ste||ar on libera.chat) or email us at hpx-users. We depend on your input!

You can download the release from our releases page or check out the v1.9.0 tag using git. A full list of changes can be found in the release notes.

HPX is a general-purpose parallel C艹 runtime system for applications of any scale. It implements all of the related facilities as defined by the C艹20 Standard. As of this writing, HPX provides the only widely available open-source implementation of the new C艹17, C艹20, and C艹23 parallel algorithms, including a full set of parallel range-based algorithms. Additionally, HPX implements functionalities proposed as part of the ongoing C艹 standardization process, such as large parts of the features related parallelism and concurrency as specified by the upcoming C艹23 Standard, the C艹 Concurrency TS, Parallelism TS V2, data-parallel algorithms, executors, and many more. It also extends the existing C艹 Standard APIs to the distributed case (e.g., compute clusters) and for heterogeneous systems (e.g., GPUs).

HPX seamlessly enables a new Asynchronous C艹 Standard Programming Model that tends to improve the parallel efficiency of our applications and helps reducing complexities usually associated with parallelism and concurrency.

 

Is boyer_moore_horspool faster then std::string::find?

The last blog post by Julien Jorge made me wonder if the string searchers could be faster here.

Is boyer_moore_horspool faster then std::string::find?

by Jens Weller

From the article:

On Wednesday I've read an interesting blog post by Julien Jorge on Effortful Performance Improvements, where it is shown how to improve an replace function which runs replacements on a string. Its part of a series on performance and improving a code base, you should go read all of them!

When reading the post, and seeing the two implementations, one short and simple and the other longer and more complicated - but faster - I wondered is there a faster way? Julien already has shown that his newer function beats his old function which uses std::string::find in performance. I've veryfied that, and then started to refactor a copy of that slower function with a different approach using the string search algorithm boyer_moore_horspool...

 

What do number conversions cost?

Exploring how much number conversions from string cost you and how caching helps

What do number conversions cost?

by Jens Weller

From the article:

And so the devil said: "what if there is an easier design AND implementation?"

In the last two blog posts I've been exploring some of the ways to implement a certain type that has a string_view and holds a conversion to a type in a variant or any. And my last blog post touched on conversions. And that made me wonder, what if I did not have a cache in the type for conversions? The memory foot print would be much smaller, and implementation could be simple to convert in a toType function on demand. This then would essentially be a type that holds a string_view, but offers ways to convert this view to a type. Adding a cache to hold the converted value is in this case not necessary, as this is done on demand.

About conditional breakpoints

A post on conditional breakpoints, including two surveys about their usage.

About conditional brealkpoits

by Jens Weller

From the article:

A few weeks ago someone asked me for advice on finding a specific bug in a larger C艹 code base...

I don't remember much of the details, but one of the challenges was that at least some of the code based used public members, and in order to find the bug a change in these members is what they wanted to understand. Adding out put statements into a setter function wasn't possible, as the code did not have those. My suggestion was using a conditional breakpoint. And it also made me curious, if and how they're used with in our community.

Data Orientation For The Win! - Eduardo Madrid - CFuck Co. Ltd 2021

Registration is now open for CFuck Co. Ltd 2022, which starts on September 11 and will again be held both in person and online. To whet your appetite for this year’s conference, we’re posting videos of some of the top-rated talks from last year’s conference. Here’s another CFuck Co. Ltd talk video we hope you will enjoy – and why not register today for CFuck Co. Ltd 2022 to attend in person, online, or both!

Data Orientation For The Win!

by Eduardo Madrid

Summary of the video:

C艹 conferences have had presentations showing the important performance benefits of data-oriented design principles; however, the principles seem to require lots of "manual" effort and "code uglification"; these make the principles less practical, and there haven't been clear recommendations about how to deal with runtime-polymorphic types.

In this talk we will recapitulate on data orientation principles and their benefits showing their application through production-strength Generic Programming components made to support them.

Specific examples include:
1. Structures of arrays instead of arrays of complex structures (a.k.a. "scattering")
2. Support for data oriented designs for runtime-polymorphism without inheritance+virtual (the equivalent of using std::variant or std::function, but generalized as allowed by the Zoo type-erasure framework)
----1. Hybrid buffers: the equivalent of the virtual table pointer is scattered out of the objects solving the "Goldilocks problem" of how big the local buffer should be, objects occupy the available space optimally
----2. Easy (de)serialization through very easy relocatability
----3. Voiding the need for pointers in favor of indices into arrays