#
# stub function definition file for docstring parsing
#
[docs]def partition(vis, outputvis='', createmms=True, separationaxis='auto', numsubms='auto', flagbackup=True, datacolumn='all', field='', spw='', scan='', antenna='', correlation='', timerange='', intent='', array='', uvrange='', observation='', feed='', disableparallel=False, ddistart=-1, taql=''):
r"""
Task to produce Multi-MSs using parallelism
[`Description`_] [`Examples`_] [`Development`_] [`Details`_]
Parameters
- vis_ (path) - Name of input measurement set
- outputvis_ (string='') - Name of output measurement set
- createmms_ (bool=True) - Should this create a multi-MS output
.. raw:: html
<details><summary><i> createmms = True </i></summary>
- separationaxis_ (string='auto') - Axis to do parallelization across(scan, spw, baseline, auto)
- numsubms_ ({string, int}='auto') - The number of SubMSs to create (auto or any number)
- flagbackup_ (bool=True) - Create a backup of the FLAG column in the MMS.
- disableparallel_ (bool=False) - Create a multi-MS in parallel.
- ddistart_ (int=-1) - Do not change this parameter. For internal use only.
- taql_ (string='') - Table query for nested selections
.. raw:: html
</details>
.. raw:: html
<details><summary><i> createmms = False </i></summary>
- separationaxis_ (string='auto') - Axis to do parallelization across(scan, spw, baseline, auto)
- numsubms_ ({string, int}='auto') - The number of SubMSs to create (auto or any number)
- flagbackup_ (bool=True) - Create a backup of the FLAG column in the MMS.
- disableparallel_ (bool=False) - Create a multi-MS in parallel.
- ddistart_ (int=-1) - Do not change this parameter. For internal use only.
- taql_ (string='') - Table query for nested selections
.. raw:: html
</details>
- datacolumn_ (string='all') - Which data column(s) to process.
- field_ ({string, stringVec, int, intVec}='') - Select field using ID(s) or name(s).
- spw_ ({string, stringVec, int, intVec}='') - Select spectral window/channels.
- scan_ ({string, stringVec, int, intVec}='') - Select data by scan numbers.
- antenna_ ({string, stringVec, int, intVec}='') - Select data based on antenna/baseline.
- correlation_ ({string, stringVec}='') - Correlation: '' ==> all, correlation="XX,YY".
- timerange_ ({string, stringVec, int, intVec}='') - Select data by time range.
- intent_ ({string, stringVec, int, intVec}='') - Select data by scan intent.
- array_ ({string, stringVec, int, intVec}='') - Select (sub)array(s) by array ID number.
- uvrange_ ({string, stringVec, int, intVec}='') - Select data by baseline length.
- observation_ ({string, stringVec, int, intVec}='') - Select by observation ID(s).
- feed_ ({string, stringVec, int, intVec}='') - Multi-feed numbers: Not yet implemented.
.. _Description:
Description
partition is a task that creates
a `Multi-MS <../../notebooks/parallel-processing.ipynb#The-Multi-MS>`__ out
of a MeasurementSet. General selection parameters are included,
and one or all of the various data columns (DATA, FLAG_DATA and/or
FLOAT_DATA, and possibly MODEL_DATA and/or CORRECTED_DATA) can be
selected.
The partition task creates a Multi-MS in parallel using the Message
Passing Interface ( `MPI <http://mpi-forum.org/>`__ ), enabled via
the `casampi
<../../notebooks/parallel-processing.ipynb#Advanced:-Interface-Framework>`__
framework.
.. note:: When partition or any other task processes an MMS in parallel,
each Sub-MS is processed independently in a parallel
engine. The log messages of the parallel engines are identified
by the string MPIServer- #, where # gives the number of the
engine running that process. When the task runs sequentially,
it shows the MPIClient text in the origin of the log messages
or does not show anything.
.. rubric:: Parameter Descriptions
*vis*
Name of input MeasurementSet.
*outputvis*
Name of output Multi-MS.
*createmms*
By default, this parameter is set to True to create an output
Multi-MS, which is the basic step for running CASA in parallel.
See more about this in the
`Parallelization <../../notebooks/parallel-processing.ipynb>`__
chapter. The task will obey the settings of the parameters listed
below if set to True. If set to False, it will work as the
**split** task and create a normal MS, split according to the
given data selection parameters. Note that, when this parameter is
set to False, a cluster will not be used.
*separationaxis*
Axis to do parallelization across. Namely, it is how the MS will
be partitioned to form separated entities, called Sub-MSs.
**partition** accepts four axes to do separation across: ’auto’,
’scan’, ’spw’ or ’baseline’. The default is set to 'auto',
which will first separate the MS in spws, then in scans. It tries
to balance the spw and scan contents in each Sub-MS, also taking
into account the available fields so that the size in disk is also
balanced. This is the recommended axis to partition an MS.
- The 'auto' option will partition the MS per scan and spw to
obtain optimal load balancing with the following criteria:
1. Maximize the scan/spw/field distribution across sub-MSs
2. Generate sub-MSs with similar size
- The 'scan' or 'spw' axes will partition the MS based on scans
or spws. The individual sub-MSs may not be balanced with
respect to the number of rows.
- The 'baseline' axis is mostly useful for Single-Dish data. This
axis will partition the MS based on the available baselines. If
the user wants only auto-correlations, use the antenna
selection such as antenna='\*&&&' together with this separation
axis. Note thatif numsubms='auto', partition will try to create
as many sub-MSs as the number of available servers in the
cluster. If the user wants to have one sub-MS for each
baseline, set the numsubms parameter to a number higher than
the number of baselines to achieve this.
*numsubms*
The number of sub-MSs to create in the Multi-MS. The default
'auto' is to partition the MS using the number of available
servers in the cluster. If the task is unable to determine the
number of running servers, or the user did not start CASA using
mpicasa, numsubms will be set to 8 Sub-MSs as default. The user
can create any number of Sub-MSs, regardless of the number of
cores used to create the cluster with mpicasa.
*flagbackup*
Make a backup of the FLAG column of the output MMS. When the MMS
is created, the `flag
versions <../../notebooks/data_examination.ipynb#Manage-flag-versions>`__ (the
.flagversions file) of the input MS are not transferred; therefore
it is necessary to re-create it for the new MMS. Note that
multiple backups from the input MS will not be preserved.
This will create a single backup of all the flags present in the
input MS at the time the MMS is created.
.. _Examples:
Examples
Other examples of running CASA in parallel can be
found `here <../../notebooks/parallel-processing.ipynb#Examples-parallelization>`__ .
Use task listpartition to see the content of the Multi-MS.
.. rubric:: Start CASA on a single node with 16 engines
The first engine will be used as the MPIClient, where the user
will see the CASA prompt. All other engines will be used as
MPIServers and will process the MS in parallel.
::
mpicasa -n 16 casa --nogui --log2term
partition(vis='uid__A1__X33993.ms', outputvis='test.mms')
.. rubric:: Run CASA on a group of nodes in a cluster
::
mpicasa -hostfile user_hostfile casa ....
partition(.....)
where user_hostfile contains the names of the nodes and the number
of engines to use in each one of them. Example:
.. code::
cvpost001, slots=5
cvpost002, slots=4
.. rubric:: Create a Multi-MS of selected spws, partitioned per spw
The first example will create 4 Sub-MSs by default, if CASA is
started with 5 engines. In the second example, use the numsubms
parameter to force the creation of 8 Sub-MSs, with one spw per
Sub-MS.
::
mpicasa -n 5 casa ...
# Ex 1: The following example will create 4 Sub-MSs by default
partition('uid001.ms', outpuvis='source.mms',
spw='1,3,5,7,9,11,13,15', separationaxis='spw')
# *Ex 2: force the creation of one spw per Sub-MS*
partition('uid001.ms', outpuvis='source.mms',
spw='1,3,5,7,9,11,13,15', separationaxis='spw', numsubms=8)
.. rubric:: Create a Multi-MS with only a certain channel range of all spws but do not back up the FLAG column
::
partition('uid0001.ms', outputvis='fewchans.mms', spw='\*:1~10',
flagbackup=False)
.. rubric:: Create a single-dish Multi-MS using the baseline axis only for auto-correlations
::
partition('uid0001.ms', outputvis='myuid.ms', createmms=True,
separationaxis='baseline', antenna='\*&&&')
.. note:: NOTE: If CASA is started without mpicasa, it is still possible to create an MMS, but the processing will be done in serial.
.. _Development:
Development
No additional development details
.. _Details:
Parameter Details
Detailed descriptions of each function parameter
.. _vis:
| ``vis (path)`` - Name of input measurement set
.. _outputvis:
| ``outputvis (string='')`` - Name of output measurement set
.. _createmms:
| ``createmms (bool=True)`` - Should this create a multi-MS output
.. _separationaxis:
| ``separationaxis (string='auto')`` - Axis to do parallelization across(scan, spw, baseline, auto)
.. _numsubms:
| ``numsubms ({string, int}='auto')`` - The number of SubMSs to create (auto or any number)
.. _flagbackup:
| ``flagbackup (bool=True)`` - Create a backup of the FLAG column in the MMS.
.. _datacolumn:
| ``datacolumn (string='all')`` - Which data column(s) to process.
.. _field:
| ``field ({string, stringVec, int, intVec}='')`` - Select field using ID(s) or name(s).
.. _spw:
| ``spw ({string, stringVec, int, intVec}='')`` - Select spectral window/channels.
.. _scan:
| ``scan ({string, stringVec, int, intVec}='')`` - Select data by scan numbers.
.. _antenna:
| ``antenna ({string, stringVec, int, intVec}='')`` - Select data based on antenna/baseline.
.. _correlation:
| ``correlation ({string, stringVec}='')`` - Correlation: '' ==> all, correlation="XX,YY".
.. _timerange:
| ``timerange ({string, stringVec, int, intVec}='')`` - Select data by time range.
.. _intent:
| ``intent ({string, stringVec, int, intVec}='')`` - Select data by scan intent.
.. _array:
| ``array ({string, stringVec, int, intVec}='')`` - Select (sub)array(s) by array ID number.
.. _uvrange:
| ``uvrange ({string, stringVec, int, intVec}='')`` - Select data by baseline length.
.. _observation:
| ``observation ({string, stringVec, int, intVec}='')`` - Select by observation ID(s).
.. _feed:
| ``feed ({string, stringVec, int, intVec}='')`` - Multi-feed numbers: Not yet implemented.
.. _disableparallel:
| ``disableparallel (bool=False)`` - Create a multi-MS in parallel.
.. _ddistart:
| ``ddistart (int=-1)`` - Do not change this parameter. For internal use only.
.. _taql:
| ``taql (string='')`` - Table query for nested selections
"""
pass