Metadata-Version: 2.4
Name: frsutils
Version: 0.0.4
Summary: Fuzzy-rough set utilities for Python.
Author-email: Mehran Amiri <meam64@gmail.com>
Maintainer-email: Mehran Amiri <meam64@gmail.com>
License: BSD-3-Clause
Project-URL: Homepage, https://github.com/mehi64/frsutils
Project-URL: Documentation, https://github.com/mehi64/frsutils#readme
Project-URL: Source, https://github.com/mehi64/frsutils
Project-URL: Issues, https://github.com/mehi64/frsutils/issues
Keywords: fuzzy-rough sets,rough sets,fuzzy sets,positive region,similarity matrix,data science
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.0
Requires-Dist: scikit-learn
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pandas; extra == "dev"
Requires-Dist: openpyxl; extra == "dev"
Requires-Dist: colorlog; extra == "dev"
Requires-Dist: matplotlib; extra == "dev"
Provides-Extra: gpu-cuda12x
Requires-Dist: cupy-cuda12x; extra == "gpu-cuda12x"
Provides-Extra: gpu-cuda11x
Requires-Dist: cupy-cuda11x; extra == "gpu-cuda11x"
Dynamic: license-file

<img title="" src="logo/logo.png" alt="frsutils Logo" width="250">

# frsutils

frsutils is a Python library for reusable fuzzy-rough set utilities. The package focuses on core fuzzy-rough building blocks such as similarity matrices, t-norms, implicators, fuzzy quantifiers, fuzzy-rough models, lower/upper approximations, and positive-region computation.

# For Developers

If you are extending frsutils, start with the public compatibility boundary in
[`docs/public_api_contract.md`](docs/public_api_contract.md), then follow the
release and documentation checks in
[`docs/release_checklist.md`](docs/release_checklist.md),
[`docs/documentation_smoke_check.md`](docs/documentation_smoke_check.md),
[`docs/submit_readiness_report.md`](docs/submit_readiness_report.md), and
[`docs/joss_final_submission_checklist.md`](docs/joss_final_submission_checklist.md).

# Installation

Install the fuzzy-rough core package:

```bash
pip install frsutils
```

For local development from this repository:

```bash
pip install -e .
```

## Core requirements

frsutils core intentionally keeps the mandatory dependency set small, but the
public API includes an sklearn-style positive-region scorer, so scikit-learn is
part of the runtime contract:

- Python >= 3.10
- NumPy >= 1.21.0
- scikit-learn

## Optional development / dataset / GPU dependencies

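Install these only when you need the related workflows. The package metadata
groups them into the `dev` extra (everything below except CuPy) and the
`gpu-cuda12x` / `gpu-cuda11x` extras (the matching CuPy wheel):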

- `pytest` for tests
- `pandas`, `openpyxl` for some dataset utilities
- `colorlog` for colored logging; frsutils falls back to standard logging if it
  is not installed
- `matplotlib` for plotting examples/tests
- `cupy-cuda12x` or another CUDA-compatible CuPy wheel for explicit
  `backend="cupy"` experiments
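
To see which of these optional packages are importable in the current environment, a small standard-library check can help. This is a sketch, not part of frsutils: the helper name `available_optional_dependencies` is hypothetical, and the module names are the import names of the packages listed above (the CuPy wheels are imported as `cupy`).

```python
from importlib.util import find_spec

# Import names of the optional packages listed above.
# The cupy-cuda12x / cupy-cuda11x wheels are both imported as "cupy".
OPTIONAL_MODULES = ("pytest", "pandas", "openpyxl", "colorlog", "matplotlib", "cupy")


def available_optional_dependencies(modules=OPTIONAL_MODULES):
    """Return {module_name: bool} without importing the modules themselves."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, present in available_optional_dependencies().items():
        print(f"{name:<12} {'installed' if present else 'missing'}")
```

Because `find_spec` only locates a module, the check does not trigger CUDA initialization or other import-time side effects.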

# Fuzzy-rough set utilities

frsutils provides reusable fuzzy-rough set calculations used in research,
including:

- lower approximation
- upper approximation
- positive region
- boundary region
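
For readers new to these quantities, the following NumPy sketch shows the textbook implicator/t-norm definitions on a toy dataset. It is illustrative only and is not the frsutils implementation: the function names are hypothetical, the similarity is assumed to be the minimum over features of `1 - |a(x) - a(y)|`, the implicator is Lukasiewicz, the t-norm is the minimum, and the boundary is taken as `upper - lower`.

```python
import numpy as np


def linear_similarity(X):
    """Pairwise similarity: min over features of 1 - |a(x) - a(y)| (assumed form)."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return np.clip(1.0 - diff, 0.0, 1.0).min(axis=2)


def itfrs_sketch(X, y):
    """Approximations of each instance's own (crisp) decision class."""
    R = linear_similarity(X)
    same_class = (y[:, None] == y[None, :]).astype(float)
    # Lower: infimum over y of I(R(x, y), A(y)), Lukasiewicz implicator.
    lower = np.minimum(1.0, 1.0 - R + same_class).min(axis=1)
    # Upper: supremum over y of T(R(x, y), A(y)), minimum t-norm.
    upper = np.minimum(R, same_class).max(axis=1)
    boundary = upper - lower
    # With crisp classes, positive-region membership equals the lower
    # approximation of the instance's own class.
    positive_region = lower
    return lower, upper, boundary, positive_region


X = np.array(
    [[0.00, 0.10], [0.08, 0.18], [0.15, 0.12],
     [0.80, 0.82], [0.88, 0.90], [0.95, 0.86]]
)
y = np.array([0, 0, 0, 1, 1, 1])

lower, upper, boundary, positive_region = itfrs_sketch(X, y)
print("lower:", lower)
print("upper:", upper)
print("boundary:", boundary)
```

An instance far from every instance of the other class gets a lower approximation close to 1; an instance near the class border gets a smaller value and a correspondingly larger boundary membership.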

## Public API quickstart

The canonical user-facing API is the package root, `frsutils`. End users, notebooks, examples, and downstream packages should import from this namespace instead of importing directly from the internal `frsutils.core` or `frsutils.utils` modules. For example:

```python
from frsutils import compute_approximations
```

The root namespace exposes the intended stable public objects and keeps internal implementation details out of the public contract.

The smallest workflow is to prepare normalized numeric data, compute fuzzy-rough approximations, and read the named fields from the returned result object.

### Example:

```python
import numpy as np

from frsutils import compute_approximations, compute_positive_region

# frsutils expects numeric feature values on a comparable scale. In real
# experiments, normalize or scale your data before calling the fuzzy-rough API.
X = np.array(
    [
        [0.00, 0.10],
        [0.08, 0.18],
        [0.15, 0.12],
        [0.80, 0.82],
        [0.88, 0.90],
        [0.95, 0.86],
    ],
    dtype=float,
)
y = np.array([0, 0, 0, 1, 1, 1], dtype=int)

result = compute_approximations(
    X,
    y,
    model="itfrs",
    similarity="linear",
)

print("lower approximation:", result.lower)
print("upper approximation:", result.upper)
print("boundary region:", result.boundary)
print("positive region:", result.positive_region)

# Shortcut when only positive-region scores are needed.
scores = compute_positive_region(
    X,
    y,
    model="itfrs",
    similarity="linear",
)
print("positive-region scores:", scores)
```

For reusable fitted scoring workflows, use the sklearn-style positive-region scorer:

```python
import numpy as np

from frsutils import FuzzyRoughPositiveRegionScorer

X = np.array(
    [
        [0.00, 0.10],
        [0.08, 0.18],
        [0.15, 0.12],
        [0.80, 0.82],
        [0.88, 0.90],
        [0.95, 0.86],
    ],
    dtype=float,
)
y = np.array([0, 0, 0, 1, 1, 1], dtype=int)

scorer = FuzzyRoughPositiveRegionScorer(
    model="owafrs",
    similarity="linear",
)

scores = scorer.fit_score(X, y)
result = scorer.as_result()

print(scores)
print(result.lower)
print(result.upper)
```

Downstream packages can reuse a precomputed similarity matrix through the public API without importing from frsutils internals:

```python
import numpy as np

from frsutils import build_similarity_matrix, compute_positive_region

X = np.array(
    [
        [0.00, 0.10],
        [0.08, 0.18],
        [0.15, 0.12],
        [0.80, 0.82],
        [0.88, 0.90],
        [0.95, 0.86],
    ],
    dtype=float,
)
y = np.array([0, 0, 0, 1, 1, 1], dtype=int)

similarity_matrix = build_similarity_matrix(X, similarity="linear")

scores = compute_positive_region(
    X=None,
    y=y,
    model="itfrs",
    similarity_matrix=similarity_matrix,
)

print(scores)
```

A runnable version of the quickstart is available at
[`examples/public_api_quickstart.py`](examples/public_api_quickstart.py). See [`docs/public_api.md`](docs/public_api.md) for the public API guide.

## Execution engines and backend status

frsutils now exposes dense and exact blockwise execution through the public API.
Dense mode preserves the historical full-matrix behavior. Blockwise mode avoids materializing the full `n x n` similarity matrix for approximation computation and is available for ITFRS, VQRS, and OWAFRS.

```python
from frsutils import compute_approximations

result = compute_approximations(
    X,
    y,
    model="itfrs",
    similarity="linear",
    engine="blockwise",
    block_size=512,
    backend="numpy",
)
```
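
To illustrate why blockwise execution can be exact, the sketch below computes a lower approximation one slab of rows at a time and compares it with the dense result. It is a minimal NumPy illustration under stated assumptions (minimum-aggregated linear similarity, Lukasiewicz implicator, hypothetical function names), not the frsutils engine itself.

```python
import numpy as np


def _lower_rows(X_rows, y_rows, X, y):
    """Lower approximation for a slab of rows against all n instances."""
    diff = np.abs(X_rows[:, None, :] - X[None, :, :])
    R = np.clip(1.0 - diff, 0.0, 1.0).min(axis=2)  # shape (b, n), never (n, n)
    same_class = (y_rows[:, None] == y[None, :]).astype(float)
    return np.minimum(1.0, 1.0 - R + same_class).min(axis=1)


def lower_dense(X, y):
    """Reference path: one slab that covers every row (full n x n matrix)."""
    return _lower_rows(X, y, X, y)


def lower_blockwise(X, y, block_size=512):
    """Same result, but only a (block_size, n) similarity slab exists at a time."""
    n = X.shape[0]
    lower = np.empty(n, dtype=float)
    for start in range(0, n, block_size):
        stop = min(start + block_size, n)
        lower[start:stop] = _lower_rows(X[start:stop], y[start:stop], X, y)
    return lower


rng = np.random.default_rng(0)
X = rng.random((50, 4))
y = rng.integers(0, 2, size=50)
assert np.allclose(lower_dense(X, y), lower_blockwise(X, y, block_size=16))
```

Each row's reduction only needs that row's similarities, so splitting the rows into blocks changes peak memory but not the values.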

`backend="cupy"` is an optional, experimental backend for GPU-accelerated similarity-block computation. For `model="itfrs"` and `model="vqrs"` with `engine="blockwise"`, the approximation reductions and accumulators can also stay CuPy-resident until the final conversion to public NumPy output. OWAFRS deliberately stays on the conservative NumPy row-buffer path, following the OWAFRS non-GPU-resident decision: exact OWA execution requires row-wise sorting, and a GPU-resident version would need its own memory/sorting benchmark first. Do not claim full GPU-native fuzzy-rough execution yet. See [`docs/backend_execution_status.md`](docs/backend_execution_status.md) and [`docs/owafrs_non_gpu_resident_decision.md`](docs/owafrs_non_gpu_resident_decision.md).

The returned result records execution provenance so benchmark scripts and downstream packages can verify which path was used:

```python
result.engine                      # "dense" or "blockwise"
result.backend                     # "numpy" or resolved optional backend
result.block_size                  # None for dense; integer for blockwise
result.used_blockwise              # bool
result.used_gpu_similarity_blocks          # bool
result.used_gpu_approximation_accumulators # bool, true for CuPy blockwise ITFRS/VQRS; false for OWAFRS
```
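
A benchmark script might wrap such a check in a small helper. The sketch below is hypothetical (the helper `assert_execution_path` is not part of frsutils) and uses a `SimpleNamespace` stand-in that only mimics the field names listed above; a real script would pass the object returned by `compute_approximations`.

```python
from types import SimpleNamespace


def assert_execution_path(result, *, engine, backend, block_size=None):
    """Raise AssertionError if the recorded provenance differs from expectations."""
    expected = {"engine": engine, "backend": backend, "block_size": block_size}
    mismatches = {
        name: {"recorded": getattr(result, name), "expected": want}
        for name, want in expected.items()
        if getattr(result, name) != want
    }
    if mismatches:
        raise AssertionError(f"unexpected execution path: {mismatches}")


# Stand-in with the documented field names (illustrative values only).
fake_result = SimpleNamespace(
    engine="blockwise",
    backend="numpy",
    block_size=512,
    used_blockwise=True,
    used_gpu_similarity_blocks=False,
    used_gpu_approximation_accumulators=False,
)

assert_execution_path(fake_result, engine="blockwise", backend="numpy", block_size=512)
```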

The sklearn-style `FuzzyRoughPositiveRegionScorer` accepts the same `engine`, `backend`, and `block_size` parameters.

## Benchmark suite

frsutils includes a reproducible benchmark harness for the public approximation API:

```bash
python benchmarks/benchmark_fuzzy_rough_execution.py \
    --models itfrs,vqrs,owafrs \
    --sample-sizes 128,256,512 \
    --n-features 8 \
    --block-sizes 64,128 \
    --scenarios dense_numpy,blockwise_numpy,blockwise_cupy \
    --repeats 3 \
    --output-json benchmark_results.json \
    --output-csv benchmark_results.csv
```

The suite compares dense NumPy, exact blockwise NumPy, and optional CuPy-backed blockwise execution. It records runtime, lightweight Python allocator peak memory, dense-reference numerical-equivalence errors, and public execution metadata. CuPy/CUDA-unavailable rows are reported as skipped. See [`docs/benchmark_suite.md`](docs/benchmark_suite.md).
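
The kinds of measurements described above can be sketched with the standard library. This is not the harness code: the helper names are hypothetical, and using `tracemalloc` for the Python allocator peak is an assumption about how such a figure could be obtained.

```python
import time
import tracemalloc


def measure(fn, *args, repeats=3, **kwargs):
    """Return (best_seconds, peak_bytes, last_result) for fn(*args, **kwargs)."""
    best = float("inf")
    peak = 0
    result = None
    for _ in range(repeats):
        tracemalloc.start()
        try:
            t0 = time.perf_counter()
            result = fn(*args, **kwargs)
            elapsed = time.perf_counter() - t0
            _, run_peak = tracemalloc.get_traced_memory()
        finally:
            tracemalloc.stop()
        best = min(best, elapsed)
        peak = max(peak, run_peak)
    return best, peak, result


def max_abs_error(reference, candidate):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(r - c) for r, c in zip(reference, candidate))


seconds, peak_bytes, total = measure(sum, range(100_000))
print(f"best of 3: {seconds:.6f} s, traced peak: {peak_bytes} bytes")
```

Tracing allocations slows the measured call, so timings taken this way are best used for relative comparisons between scenarios rather than as absolute figures.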

## Release-ready examples and paper claim boundary

The repository includes small release-ready examples:

```bash
python examples/public_api_quickstart.py
python examples/benchmark_smoke.py --output-dir benchmark_smoke_output
```

Use the wording in [`docs/paper_claims.md`](docs/paper_claims.md) when describing
frsutils in a release note, software paper, or benchmark report. The safe claim
is that frsutils provides dense and exact blockwise fuzzy-rough approximation
APIs, optional CuPy-accelerated similarity blocks, and experimental
CuPy-resident blockwise approximation accumulators for ITFRS/VQRS. Public
outputs remain NumPy arrays, and OWAFRS remains on the conservative exact
blockwise NumPy row-buffer path in this release.

Before tagging or submitting, use
[`docs/release_checklist.md`](docs/release_checklist.md) and the
[`docs/documentation_smoke_check.md`](docs/documentation_smoke_check.md).

## Algorithms and contents

- Similarities (See [fuzzy similarities](docs/similarities_info.md))
  - Linear
  - Gaussian
- Implicators (See [fuzzy implicators](docs/implicators_info.md))
  - Lukasiewicz
  - Goedel
  - Reichenbach
  - Kleene-Dienes
  - Goguen
  - Yager
  - Rescher
  - Weber
  - Fodor
- T-norms (See [fuzzy t-norms](docs/tnorms_info.md))
  - Min t-norm
  - Product t-norm
  - Lukasiewicz t-norm
  - Yager t-norm
  - DrasticProduct t-norm
  - EinsteinProduct t-norm
  - HamacherProduct t-norm
  - NilpotentMinimum t-norm
- OWA weights (Ordered Weighted Average) (See [OWA](docs/owa_weights_info.md))
  - Linear
  - Exponential
  - Harmonic
  - Logarithmic
- Fuzzy quantifiers
  - Linear
  - Quadratic
- FR Models
  - ITFRS (See [Implicator/T-norm Fuzzy-Rough Sets](docs/itfrs_info.md))
  - OWAFRS (See [Ordered Weighted Average Fuzzy-Rough Sets](docs/owafrs_info.md))
  - VQRS (See [Vaguely Quantified Rough Sets](docs/vqrs_info.md))
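
As a concrete illustration of how OWA weights soften a strict minimum, the sketch below builds linearly increasing weights and applies them to values sorted in descending order, so the smallest values receive the largest weights. The function names are hypothetical, and whether the frsutils "Linear" OWA option uses exactly this weighting is an assumption; see the OWA document linked above for the library's definitions.

```python
import numpy as np


def linear_owa_weights(p):
    """Increasing linear weights 2i / (p(p + 1)) for i = 1..p; they sum to 1."""
    i = np.arange(1, p + 1, dtype=float)
    return 2.0 * i / (p * (p + 1))


def owa_soft_min(values, weights):
    """OWA aggregation: sort descending, then take the weighted sum."""
    ordered = np.sort(values)[::-1]
    return float(ordered @ weights)


# Implicator values of one instance against all others (illustrative numbers).
values = np.array([1.0, 1.0, 1.0, 0.8, 0.88, 0.95])
weights = linear_owa_weights(len(values))

print("strict minimum:", values.min())
print("OWA soft minimum:", owa_soft_min(values, weights))
```

The soft minimum lies between the strict minimum and the mean, which makes the lower approximation less sensitive to a single outlying instance. The sort in `owa_soft_min` is the row-wise sorting step that an exact OWA computation needs.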

## Fuzzy-rough oversampling boundary

Fuzzy-rough oversampling algorithms are no longer part of frsutils core. They
live in the standalone `frsampling` package, which depends on
frsutils through the public `frsutils` namespace. frsutils intentionally does
not provide old FRSMOTE compatibility wrappers.

```text
frsampling  --->  frsutils
frsutils    -X->  frsampling
```

frsutils should be cited/used as the fuzzy-rough core engine: similarities,
t-norms, implicators, fuzzy quantifiers, approximation models, lower/upper
approximation, and positive region. Oversampling algorithms such as FRSMOTE and
future FRADASYN belong to the downstream oversampling package.

## Notes and assumptions

- Inputs must be normalized. Numeric values used by fuzzy-rough computations,
  whether scalars or NumPy arrays, are expected to lie in the range [0, 1].
- This library uses all features of data instances to calculate fuzzy-rough
  measures.
- Positive region, lower approximation, upper approximation, etc. are calculated
  based on the class of each instance.
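
One simple way to bring raw features into [0, 1] is column-wise min-max scaling. The helper below is a minimal sketch with a hypothetical name, not a frsutils function; scikit-learn's `MinMaxScaler` is an established alternative when the scaling has to be fitted on training data and reused.

```python
import numpy as np


def min_max_normalize(X):
    """Scale each column to [0, 1]; constant columns are mapped to 0."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    # Avoid division by zero for columns where every value is identical.
    safe_range = np.where(col_range > 0, col_range, 1.0)
    return (X - col_min) / safe_range


raw = np.array([[10.0, 200.0], [20.0, 200.0], [30.0, 200.0]])
X = min_max_normalize(raw)
print(X)
```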

## Docs

- We use compact NumPy-style Python docstrings and keep longer examples in the README and docs.
- The online documentation is available at
  [mehi64.github.io/frsutils](https://mehi64.github.io/frsutils/).

## How to run tests

From the repository root, the default test command excludes tests marked as
`slow` via `pyproject.toml`:

```bash
python -m pytest tests -q
```

To run all tests, including slow-marked ones:

```bash
python -m pytest tests benchmarks -m "slow or not slow" -vv -rs
```

Run the documented quickstart and release/backend smoke set explicitly with:

```bash
python examples/public_api_quickstart.py
python -m pytest tests/api/test_public_api_examples_smoke.py -q -rs
python -m pytest tests/api tests/core_tests/test_approximation_engines.py -q -rs
```

See [`docs/release_validation_commands.md`](docs/release_validation_commands.md)
for the complete validation command list.

Run exhaustive slow model-combination tests separately when needed:

```bash
python -m pytest tests/models_tests -m slow -o addopts="" -q
```

For the standalone oversampling package, run from the `frsampling` repository
root after making frsutils importable:

```bash
PYTHONPATH="$PWD/src:../frsutils" python -m pytest tests -q
```

For more information on test procedures, please refer to
[test procedures](tests/test_procedures.md).

## Technical decisions justification

- Since data checking can slow down experiments, heavy numeric functions do not
  perform repeated input-range checks. Validation is preferred at construction or
  workflow boundaries.
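
The pattern described above can be sketched as follows: validate once where data enters a workflow, then call numeric kernels that trust their input. All names here are hypothetical and the kernel is a stand-in, not frsutils code.

```python
import numpy as np


def validate_unit_interval(X, name="X"):
    """Check once that X is a finite 2-D array with values in [0, 1]."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError(f"{name} must be 2-D, got {X.ndim} dimension(s)")
    if not np.isfinite(X).all():
        raise ValueError(f"{name} contains NaN or infinite values")
    if X.min() < 0.0 or X.max() > 1.0:
        raise ValueError(f"{name} must be normalized to the range [0, 1]")
    return X


def _similarity_kernel(X):
    """Hot path: assumes validated input and performs no range checks."""
    diff = np.abs(X[:, None, :] - X[None, :, :])
    return (1.0 - diff).min(axis=2)


def similarity_workflow(X):
    """Workflow boundary: validate once, then run the unchecked kernel."""
    X = validate_unit_interval(X)
    return _similarity_kernel(X)


R = similarity_workflow([[0.0, 0.1], [0.8, 0.82]])
print(R)
```

Repeated calls to the kernel inside a loop then pay no validation cost, while bad input is still rejected with a clear message at the entry point.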

## Maintenance notes

- Exhaustive model-combination tests are marked as `slow`; run them explicitly
  with `python -m pytest tests/models_tests -m slow -o addopts="" -q`.
- VQRS is implemented and covered by the public API/blockwise/backend tests.
- New feature work should be deferred until the release/paper cleanup checklist is
  complete.

## License

This project is licensed under the BSD-3-Clause License. See the [LICENSE](./LICENSE)
file for details. The package metadata, citation metadata, and Python source
headers use the same `BSD-3-Clause` identifier.

## How to cite us in your research papers

If you use this library in your research, please cite the software metadata in
[`CITATION.cff`](./CITATION.cff). After the JOSS paper is accepted, cite the
JOSS paper DOI as the preferred citation.

**APA**:

> Amiri, M. (2026). *frsutils: Fuzzy-Rough Set Utilities for Python*
> (Version 0.0.4) [Computer software]. https://github.com/mehi64/frsutils

**BibTeX**:

```bibtex
@software{Amiri_frsutils_2026,
  author = {Amiri, Mehran},
  title = {frsutils: Fuzzy-Rough Set Utilities for Python},
  url = {https://github.com/mehi64/frsutils},
  version = {0.0.4},
  year = {2026}
}
```
