libuplift.datasets.pbc
======================

.. py:module:: libuplift.datasets.pbc

.. autoapi-nested-parse::

   The pbc dataset from the R survival package.

   ..
       !! processed by numpydoc !!


Functions
---------

.. autoapisummary::

   libuplift.datasets.pbc.fetch_pbc


Module Contents
---------------

.. py:function:: fetch_pbc(data_home=None, download_if_missing=True, random_state=None, shuffle=False, categ_as_strings=False, return_X_y=False, as_frame=False)

   
   Load the pbc dataset from the R survival package (uplift survival).

   Download it if necessary.

   Only the first 312 records, the ones with an assigned treatment, are kept.

   Following the original dataset, the edema variable is numerical
   but can also be treated as categorical: 0 = no edema,
   0.5 = untreated or successfully treated, 1 = edema despite
   diuretic therapy.
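
   As an illustration of the categorical reading, the three edema codes can
   be mapped to descriptive labels. This is a minimal sketch, not part of the
   library: the names ``EDEMA_LABELS`` and ``edema_as_category`` are
   hypothetical, and only the codes and their meanings come from the
   description above.

   .. code-block:: python

      # Hypothetical helper: only the three codes and their meanings are
      # taken from the dataset description.
      EDEMA_LABELS = {
          0.0: "no edema",
          0.5: "untreated or successfully treated",
          1.0: "edema despite diuretic therapy",
      }


      def edema_as_category(values):
          """Map numeric edema codes to descriptive labels."""
          return [EDEMA_LABELS[float(v)] for v in values]


      # Example codes chosen for illustration, not read from the dataset.
      labels = edema_as_category([0, 0.5, 1, 0])

   An unexpected code raises ``KeyError``, which may be preferable to
   silently passing it through.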

   **Variables**

   The ``chol``, ``copper``, ``trig`` and ``platelet`` variables contain
   missing data.
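
   Because these variables have gaps, downstream code usually needs to count
   or fill them first. The sketch below assumes that missing entries are
   encoded as ``NaN`` floats, which is not stated here and should be checked
   against the loaded data. The helper names are hypothetical, the values are
   made up, and median imputation is only one possible choice.

   .. code-block:: python

      import math
      from statistics import median


      def count_missing(column):
          """Count NaN entries in a sequence of floats."""
          return sum(math.isnan(v) for v in column)


      def impute_median(column):
          """Replace NaN entries with the median of the observed values."""
          observed = [v for v in column if not math.isnan(v)]
          fill = median(observed)
          return [fill if math.isnan(v) else v for v in column]


      # Made-up values for illustration; assumes missing data is NaN.
      chol = [200.0, float("nan"), 300.0, 250.0]
      n_missing = count_missing(chol)
      chol_filled = impute_median(chol)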

   :Parameters:

       **data_home** : string, optional
           Specify another download and cache folder for the datasets. By default
           all scikit-learn data is stored in '~/scikit_learn_data' subfolders.

       **download_if_missing** : boolean, default=True
           If False, raise an IOError if the data is not locally available
           instead of trying to download the data from the source site.

       **random_state** : int, RandomState instance or None (default)
           Determines random number generation for dataset shuffling. Pass an int
           for reproducible output across multiple function calls.

       **shuffle** : bool, default=False
           Whether to shuffle dataset.

       **categ_as_strings** : bool, default=False
           Whether to return categorical variables as strings.

       **return_X_y** : boolean, default=False
           If True, returns ``(data.data, data.target)`` instead of a Bunch
           object.

       **as_frame** : boolean, default=False
           If True, features are returned as a pandas DataFrame. If False,
           features are returned as an object or float array. A float array
           is returned if all features are floats.


   :Returns:

       **dataset** : dict-like object with the following attributes:
           ..

       **dataset.data** : numpy array
           Each row corresponds to the features in the dataset.

       **dataset.target_status** : numpy array
           Censoring status: 0=censored, 1=transplant, 2=dead.

       **dataset.target_time** : numpy array
           Censoring, transplant or death time.

       **dataset.DESCR** : string
           Description of the dataset.

       **(data, target_time, target_status)** : tuple if
           ``return_X_y`` is True
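
   Since ``target_status`` has three levels, survival code that expects a
   binary event indicator needs a rule for transplants. The sketch below is
   one possible convention, not the library's: by default it counts only
   death as the event and treats transplant as censored. The function name
   and the ``transplant_as_event`` flag are hypothetical.

   .. code-block:: python

      # Status codes as documented: 0=censored, 1=transplant, 2=dead.
      CENSORED, TRANSPLANT, DEAD = 0, 1, 2


      def death_event_indicator(status, transplant_as_event=False):
          """Return 1 where the event occurred and 0 where it is censored."""
          events = {DEAD, TRANSPLANT} if transplant_as_event else {DEAD}
          return [int(s in events) for s in status]


      # Made-up status codes for illustration.
      status = [0, 1, 2, 2, 0]
      event = death_event_indicator(status)
      event_or_transplant = death_event_indicator(status, transplant_as_event=True)

   Whether a transplant counts as an event depends on the analysis, so the
   choice is kept explicit rather than hidden in the helper.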


   ..
       !! processed by numpydoc !!