Support to Convert .fit Results to CSV (or any format) #393
Open
kevScheuer wants to merge 39 commits into
Converter directory now reflects that any other data converters may be added in the future, not just CSV.
The parameters are now saved with their errors. The verbose flag now controls the amount of output during processing.
Previously, amplitudes with common amp names in "reaction::sum::ampName" format were required to be constrained to each other. Now the mapping is saved for unique amplitude groups, e.g. "ampName", "sum::ampName", or the full "reaction::sum::ampName" strings.
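To illustrate the three grouping levels named in this commit, here is a minimal Python sketch (the PR itself is C++; this is not its implementation). The function names `amplitude_keys` and `group_amplitudes`, the level labels, and the example reaction, sum, and amplitude names are all hypothetical.

```python
from __future__ import annotations


def amplitude_keys(full_name: str) -> dict[str, str]:
    """Return the three grouping keys for one 'reaction::sum::ampName' string."""
    parts = full_name.split("::")
    if len(parts) != 3:
        raise ValueError(f"expected 'reaction::sum::ampName', got {full_name!r}")
    _reaction, coherent_sum, amp = parts
    return {
        "amp": amp,                          # "ampName"
        "sum_amp": f"{coherent_sum}::{amp}",  # "sum::ampName"
        "full": full_name,                   # "reaction::sum::ampName"
    }


def group_amplitudes(full_names: list[str], level: str) -> dict[str, list[str]]:
    """Map each unique key at the chosen level to the full names sharing it."""
    groups: dict[str, list[str]] = {}
    for name in full_names:
        groups.setdefault(amplitude_keys(name)[level], []).append(name)
    return groups


# Hypothetical amplitude names, for illustration only.
names = [
    "reactionA::PosRe::1Spp",
    "reactionA::PosIm::1Spp",
    "reactionB::PosRe::1Spp",
]
by_amp = group_amplitudes(names, "amp")
by_sum_amp = group_amplitudes(names, "sum_amp")
```

Grouping at the "amp" level collapses all three names into one group, while the "full" level keeps every name separate.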
Files are accessed so many times it makes more sense to save them. File loading happens in the constructor now. Also added a background file bool for easy tracking of whether or not the background files are present. A template for getting the -t values is also added, but not yet implemented. This will also affect how the other distributions are handled.
The largest addition is a function that extracts the values of interest for the beam energy, which incorporates signal and background subtraction. To help with this, a min/max finder function was added to find a common min/max value for a branch across files. A few other report lines were added, along with some fixes so the code compiles properly.
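The idea of a common min/max across files can be sketched as follows. This is a plain-Python illustration with lists standing in for the ROOT branch in each file; `common_range` and the sample values are hypothetical and not taken from the PR.

```python
from __future__ import annotations


def common_range(branch_per_file: list[list[float]]) -> tuple[float, float]:
    """Return one (min, max) pair that covers the branch values of every file."""
    non_empty = [values for values in branch_per_file if values]
    if not non_empty:
        raise ValueError("no values found in any file")
    lo = min(min(values) for values in non_empty)
    hi = max(max(values) for values in non_empty)
    return lo, hi


# Hypothetical beam-energy values from a data file and a background file.
data_energy = [8.2, 8.5, 8.4]
background_energy = [8.0, 8.8]
lo, hi = common_range([data_energy, background_energy])
```

Using one shared range means the signal and background histograms get identical binning, which is what makes a bin-by-bin subtraction meaningful.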
Uses an RDataFrame method to compute t from the various 4-vector component branches, then fills a histogram with the t values. If background files are present, also computes a background histogram and subtracts it from the data histogram before calculating statistics. Aside from this, small reports and comments were added.
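The subtract-then-compute-statistics step can be sketched without ROOT as below. This is an assumption-laden illustration, not the PR's RDataFrame code: `fill_histogram`, `subtract`, and `histogram_mean` are hypothetical helpers, and the mean is taken over bin centers.

```python
from __future__ import annotations


def fill_histogram(values, weights, n_bins, lo, hi):
    """Fill a fixed-binning weighted histogram; values outside [lo, hi) are dropped."""
    counts = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for value, weight in zip(values, weights):
        if lo <= value < hi:
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += weight
    return counts


def subtract(data_counts, background_counts):
    """Bin-by-bin data minus background (both must share the same binning)."""
    return [d - b for d, b in zip(data_counts, background_counts)]


def histogram_mean(counts, lo, hi):
    """Mean of the distribution, using bin centers weighted by bin content."""
    width = (hi - lo) / len(counts)
    centers = [lo + (i + 0.5) * width for i in range(len(counts))]
    return sum(c * n for c, n in zip(centers, counts)) / sum(counts)


# Hypothetical -t values; unit weights for simplicity.
data = fill_histogram([0.1, 0.2, 0.7, 0.8], [1.0] * 4, n_bins=2, lo=0.0, hi=1.0)
background = fill_histogram([0.2], [1.0], n_bins=2, lo=0.0, hi=1.0)
signal = subtract(data, background)
mean_t = histogram_mean(signal, 0.0, 1.0)
```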
Removed the mass-branch arg, as the mass can be calculated from the labeled 4-vectors. The indices can now be set by the user. Aside from this, the files have been formatted.
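As a rough sketch of how a mass and -t can be obtained from indexed 4-vectors: the mass is the invariant mass of the summed 4-vectors at the chosen indices, and -t is taken here as minus the squared difference between the target and the summed lower-vertex 4-vectors. That definition of t, the (E, px, py, pz) ordering, and all function names are assumptions for illustration; the PR's own conventions may differ.

```python
from __future__ import annotations

import math

FourVector = tuple[float, float, float, float]  # (E, px, py, pz), assumed ordering


def add_vectors(vectors: list[FourVector]) -> FourVector:
    """Component-wise sum of 4-vectors."""
    return tuple(sum(components) for components in zip(*vectors))


def mass_squared(p: FourVector) -> float:
    """Minkowski norm E^2 - |p|^2."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz


def invariant_mass(final_state: list[FourVector], indices: list[int]) -> float:
    """Invariant mass of the particles at the user-chosen indices."""
    total = add_vectors([final_state[i] for i in indices])
    return math.sqrt(max(mass_squared(total), 0.0))


def minus_t(target: FourVector, final_state: list[FourVector], lower_indices: list[int]) -> float:
    """-t = -(p_target - p_recoil)^2, with the recoil built from the lower-vertex indices."""
    recoil = add_vectors([final_state[i] for i in lower_indices])
    diff = tuple(a - b for a, b in zip(target, recoil))
    return -mass_squared(diff)


# Hypothetical event: index 0 is the recoil, indices 1 and 2 form the upper vertex.
event = [(1.25, 0.0, 0.0, 0.75), (1.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, -1.0)]
upper_mass = invariant_mass(event, [1, 2])
neg_t = minus_t((1.0, 0.0, 0.0, 0.0), event, [0])
```

In this toy event the target and recoil both have mass 1, so -t reduces to 2m(E - m) = 0.5.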
Having the functions return the created histogram:
1. Makes it easier to understand the purpose of the function, and doesn't hide the map filling in the implementation
2. Allows for the possibility of printing the hist for debugging purposes

Also added a function to return the total number of events and its error.
In order to save the coherent sums, a new AmplitudeParser class was created to parse the amplitude names and categorize them into groups based on the quantum numbers they contain. This relies on known "naming schemes" for the amplitudes. Currently the most common schemes are supported, with instructions for how to add new schemes.
Also added a quick method to get the reaction string, which was helpful for the normInt functions. This commit also includes some formatting.
Test status for this pull request: SUCCESS
Summary: /work/halld/pull_request_test/halld_sim^csv_converter/tests/summary.txt
Build log: /work/halld/pull_request_test/halld_sim^csv_converter/make_csv_converter.log
This request is to merge a script and set of classes that will allow any AmpTools-based analysis to convert their `.fit` results into a comma-separated value (CSV) file. Several plotters already exist for analyzing fit results per bin, and these are very well suited for analyzing the angular distributions, but mass-independent fits must "stitch" together their fit results to observe any behavior of the amplitudes and phases across mass bins. In addition, the 100s of fit results produced by bootstrap or randomized fits have no standard way to be aggregated. This CSV converter is designed to fill this gap in the analysis process. Below I've provided a short description for each component added.

convert_to_csv

This is the primary script that users will interact with. A user with several fit results `result_1.fit`, `result_2.fit`, ... can simply execute the script, and a CSV will be made where each row corresponds to a `.fit` file, and the columns indicate AmpTools fit outputs, parameters, intensities, and phase differences.

This CSV can then be read into a Python Pandas dataframe, ROOT tree or dataframe, or used by practically any programming language, and then plotted. The script is designed to be as generic as possible, so that any AmpTools-based analysis can use it. Listed below are some highlighted features of the script:
- The `--data-file` flag. It will read the associated data (with optional weights and/or background) files of the result and extract the info to a CSV file
- The `--lower-vertex-indices` flag. This tells the `ROOTDataConverter` which 4-vector indices correspond to the upper or lower vertex, thus allowing the correct calculation of the mass and $-t$
- The `--naming-scheme` flag. See `AmplitudeParser` for more details

FitConverter

Handles the `.fit` -> `.csv` conversion. This class stores: ... (see `AmplitudeParser` below)

Currently supports `.fit` -> `.csv` conversion, but can easily be expanded to any file format desired. This is because all the results of interest are stored in various maps, and so writing to CSV is as easy as iterating over the maps.

ROOTDataConverter

This class is responsible for extracting the PWA-related information from a ROOT file. It stores: ...
Just like the `FitConverter`, any file format beyond CSV can be used. To get the info, the class uses the data and Monte Carlo files associated with the fit. If available, it also properly incorporates event weights or background files. As discussed above, to calculate the mass and $-t$ info, the user specifies the 4-vector indices.
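The "results live in maps, so writing CSV is just iterating over them" idea, along with reading the file back for plotting, can be sketched like this. It is a Python stand-in for the C++ classes, not their actual interface: `results_to_csv` and the column names `file`, `likelihood`, and `1Spp` are hypothetical.

```python
from __future__ import annotations

import csv
import io


def results_to_csv(results: list[dict[str, object]]) -> str:
    """Write one row per fit result; columns are the union of all map keys."""
    columns: list[str] = []
    for row in results:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(results)
    return buffer.getvalue()


# Hypothetical per-file maps; the real converter fills these from the .fit files.
results = [
    {"file": "result_1.fit", "likelihood": -100.5},
    {"file": "result_2.fit", "likelihood": -98.0, "1Spp": 1200.0},
]
csv_text = results_to_csv(results)

# Reading the CSV back, as a downstream plotting script might.
rows = list(csv.DictReader(io.StringIO(csv_text)))
```

Swapping the writer for another serializer would leave the map-filling code untouched, which is the sense in which other output formats are easy to add.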
AmplitudeParser

This was the biggest hurdle for generalizing the converter. Often we are not just interested in the individual amplitudes and phases, but their (in)coherent sums, like "total reflectivity contribution" or "behavior of JL waves summed over the spin-projections". The problem is that these sums are typically defined manually, because the amplitudes (and thus their quantum numbers) are user defined. The only way to identify them for grouping is by identifying the naming scheme of the amplitude, but not everyone uses the same scheme.

This class tries to identify the amplitude naming scheme used, and defines a set of possible sums based on the quantum numbers given in the scheme. It currently supports:
- `JLme` - the current recommended generic format
- `eJPmL` - used for some vector-pseudoscalar analyses
- `Lme` - common scheme for 2-pseudoscalar analysis

but can be easily extended to other schemes by users.
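A minimal sketch of scheme identification and grouping is shown below. The character encodings are assumptions made purely for illustration (J and P/e/m as single characters, `p`/`m`/`0` for signs and projections, L as one of S, P, D, F, G); the real `AmplitudeParser` may encode these differently, and `SCHEMES`, `identify_scheme`, and `group_by` are hypothetical names.

```python
from __future__ import annotations

import re

# Assumed encodings, not taken from the PR: adjust the patterns to the real schemes.
SCHEMES = {
    "JLme": re.compile(r"^(?P<J>\d)(?P<L>[SPDFG])(?P<m>[m0p])(?P<e>[pm])$"),
    "eJPmL": re.compile(r"^(?P<e>[pm])(?P<J>\d)(?P<P>[pm])(?P<m>[m0p])(?P<L>[SPDFG])$"),
    "Lme": re.compile(r"^(?P<L>[SPDFG])(?P<m>[m0p])(?P<e>[pm])$"),
}


def identify_scheme(names: list[str]) -> str:
    """Return the first scheme whose pattern matches every amplitude name."""
    for scheme, pattern in SCHEMES.items():
        if names and all(pattern.match(name) for name in names):
            return scheme
    raise ValueError("no known naming scheme matches all amplitude names")


def group_by(names: list[str], scheme: str, quantum_numbers: tuple[str, ...]) -> dict[tuple[str, ...], list[str]]:
    """Group amplitudes sharing the given quantum numbers, i.e. one candidate sum per key."""
    groups: dict[tuple[str, ...], list[str]] = {}
    for name in names:
        parsed = SCHEMES[scheme].match(name).groupdict()
        key = tuple(parsed[q] for q in quantum_numbers)
        groups.setdefault(key, []).append(name)
    return groups


# Hypothetical amplitude names under the assumed JLme encoding.
amplitudes = ["1Spp", "1S0p", "1Smp", "1Dpm"]
scheme = identify_scheme(amplitudes)
jl_sums = group_by(amplitudes, scheme, ("J", "L"))   # JL waves summed over m
reflectivity_sums = group_by(amplitudes, scheme, ("e",))
```

Under these assumptions, supporting a new scheme amounts to adding one more pattern with named groups for its quantum numbers.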
Updates from previous version
For those using the older standalone version of this script shown in the last tutorial, I figure it's worth listing some key differences:
`halld_sim` now