class sts_select.scoring.BaseSTSScorer(X, y, X_names=None, y_names=None, cache=None, **kwargs)

Base class for scorers that use semantic textual similarity (STS). Assumes a simple function that takes two strings and returns a similarity score on a uniform scale.

Parameters:
  • X

  • y

  • X_names

  • y_names

  • cache

  • kwargs
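To illustrate the kind of function this class assumes, here is a minimal sketch of a string-pair scorer that returns a value on a fixed [0, 1] scale. The name jaccard_similarity and the token-overlap measure are hypothetical stand-ins chosen for illustration; they are not part of sts_select, and a real STS scorer would normally use a language model instead.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Toy string-pair scorer: token-set overlap, always in [0, 1]."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Two of the four distinct tokens are shared, so the score is 0.5.
print(jaccard_similarity("systolic blood pressure", "diastolic blood pressure"))
```

Any function with this shape (two strings in, one bounded score out) fits the assumption stated above.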

class sts_select.scoring.BaseScorer(X, y, X_names=None, y_names=None, cache=None, verbose=0, **kwargs)

Base scorer class. All scorers should inherit from this class.

Parameters:
  • X – Source data to score.

  • y – Target data to score.

  • X_names – Source names to score.

  • y_names – Target names to score.

  • cache – Cache location for storing scores.

  • kwargs – Additional arguments.

X_score(x1, x2)
Parameters:
  • x1 (int)

  • x2 (int)

X_y_score(x, y)
Parameters:
  • x (int)

  • y (int)

load_cache()

Load the cache.

Returns:

save_cache()

Save the cache.

Returns:

score(X, y)

Score the given X and y.

Parameters:
  • X

  • y

Returns:

scored()

Check if the scorer has already been initialized.

Returns:

class sts_select.scoring.Chi2Scorer(X, y, cache=None, random_state=0, **kwargs)

Scorer for chi-squared (valid for categorical X and y only).
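For reference, the textbook chi-squared statistic for two categorical variables can be sketched in plain Python as below. The helper chi2_statistic is hypothetical and only shows the quantity being measured; how Chi2Scorer computes its score internally is not documented here and may differ.

```python
from collections import Counter


def chi2_statistic(x, y):
    """Pearson chi-squared statistic for two categorical sequences."""
    n = len(x)
    observed = Counter(zip(x, y))  # joint counts; missing cells count as 0
    x_totals = Counter(x)
    y_totals = Counter(y)
    stat = 0.0
    for x_value, x_count in x_totals.items():
        for y_value, y_count in y_totals.items():
            # Expected count if x and y were independent.
            expected = x_count * y_count / n
            stat += (observed[(x_value, y_value)] - expected) ** 2 / expected
    return stat


dependent = chi2_statistic(["a", "a", "b", "b"], ["yes", "yes", "no", "no"])
independent = chi2_statistic(["a", "a", "b", "b"], ["yes", "no", "yes", "no"])
print(dependent, independent)
```

A larger statistic indicates a stronger association between the two categorical variables.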

class sts_select.scoring.FScorer(X, y, cache=None, random_state=0, **kwargs)

Scorer for the F-statistic.

class sts_select.scoring.GensimScorer(X, y, X_names=None, y_names=None, cache=None, model_path=None, verbose=0, model_type=None, **kwargs)

Scorer for the Gensim library.

Parameters:
  • X – Source data to score (not used).

  • y – Target data to score (not used).

  • X_names – List of strings to score.

  • y_names – List of strings to score.

  • cache – Cache location for storing scores.

  • model_path – Path to the Gensim model.

  • model_type (type)

score(X, y)

Scores the similarity of the Gensim embeddings.

Parameters:
  • X – Source data to score (not used).

  • y – Target data to score (not used).

Returns:
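The documentation does not state which similarity measure GensimScorer applies to the embeddings. Cosine similarity is a common choice for word and document vectors, so the sketch below assumes it. The function cosine_similarity and the toy three-dimensional vectors are hypothetical and do not come from a real Gensim model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up embeddings keyed by feature name (assumption: one vector per name).
embeddings = {
    "heart rate": [1.0, 0.0, 0.0],
    "pulse": [1.0, 0.0, 0.0],
    "shoe size": [0.0, 1.0, 0.0],
}

print(cosine_similarity(embeddings["heart rate"], embeddings["pulse"]))
print(cosine_similarity(embeddings["heart rate"], embeddings["shoe size"]))
```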

class sts_select.scoring.LinearScorer(X, y, cache=None, **kwargs)

set_params(**params)

class sts_select.scoring.MIScorer(X, y, cache=None, random_state=0, **kwargs)

Scorer for mutual information.
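As a reminder of the quantity involved, mutual information between two discrete variables can be sketched as follows. The helper mutual_information is hypothetical and works only on discrete values; the estimator that MIScorer actually uses (for example, for continuous data) is not specified in this reference.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information, in nats, between two discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    x_counts = Counter(x)
    y_counts = Counter(y)
    mi = 0.0
    for (x_value, y_value), count in joint.items():
        p_xy = count / n
        p_x = x_counts[x_value] / n
        p_y = y_counts[y_value] / n
        mi += p_xy * math.log(p_xy / (p_x * p_y))
    return mi


# Identical variables share all their information; independent ones share none.
print(mutual_information([0, 0, 1, 1], [0, 0, 1, 1]))
print(mutual_information([0, 0, 1, 1], [0, 1, 0, 1]))
```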

class sts_select.scoring.PearsonsRScorer(X, y, cache=None, random_state=0, **kwargs)

Scorer for Pearson’s r.
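Pearson's r measures the linear correlation between two numeric sequences and ranges from -1 to 1. The plain-Python sketch below shows the standard formula. The helper pearsons_r is hypothetical and is not taken from the sts_select source, and it assumes neither input is constant.

```python
import math


def pearsons_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    # Assumes both sequences vary; a constant input would divide by zero.
    return covariance / math.sqrt(spread_x * spread_y)


print(pearsons_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear, increasing
print(pearsons_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly linear, decreasing
```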

class sts_select.scoring.SentenceTransformerScorer(X, y, X_names=None, y_names=None, cache=None, model_path=None, verbose=0, **kwargs)

STS scorer using the SentenceTransformers library.

Parameters:
  • X – Source data to score (not used).

  • y – Target data to score (not used).

  • X_names – List of strings to score.

  • y_names – List of strings to score.

  • cache – Cache location for storing scores.

  • model_path – Path to the SentenceTransformers model.

score(X, y)

Generates the feature-feature and feature-target scores.

Parameters:
  • X – Source data to score (not used).

  • y – Target data to score (not used).

Returns:
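The feature-feature and feature-target scores mentioned above can be pictured as a square matrix over X_names plus a matrix of X_names against y_names. The sketch below is an assumption about that layout, not the library's documented return value: pairwise_name_scores is a hypothetical helper, and difflib's character-level ratio stands in for a SentenceTransformers model so that the example runs without downloading anything.

```python
from difflib import SequenceMatcher


def string_ratio(a, b):
    """Stand-in similarity in [0, 1]; a real scorer would compare embeddings."""
    return SequenceMatcher(None, a, b).ratio()


def pairwise_name_scores(X_names, y_names, similarity):
    """Score every feature name against every other feature and each target."""
    feature_feature = [[similarity(a, b) for b in X_names] for a in X_names]
    feature_target = [[similarity(a, t) for t in y_names] for a in X_names]
    return feature_feature, feature_target


X_names = ["resting heart rate", "maximum heart rate", "age"]
y_names = ["heart disease"]
feature_feature, feature_target = pairwise_name_scores(X_names, y_names, string_ratio)

for name, row in zip(X_names, feature_target):
    print(name, row)
```

Because the scores depend only on the names, the data arrays X and y play no part in this sketch, which matches the "not used" notes in the parameter list.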