[SPARK-7262][ML] Binary LogisticRegression with L1/L2 (elastic net) using OWLQN in new ML package #5967

Closed
wants to merge 10 commits into from

dbtsai
Member

@dbtsai dbtsai commented May 7, 2015

  1. Handle scaling and addBias internally.
  2. L1/L2 elasticnet using OWLQN optimizer.
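
To make item 2 concrete, here is a minimal plain-Scala sketch of how an elastic-net penalty is commonly split into an L1 part and an L2 part. It is an illustration, not the code from this patch: the object name `ElasticNetSketch` and the parameter names `regParam` (overall strength) and `elasticNetParam` (mixing ratio in [0, 1]) are assumptions made for the example.

```scala
// Sketch only: illustrates the usual elastic-net parameterization.
// All names here are assumed for the example, not taken from the patch.
object ElasticNetSketch {

  /** Split one regularization strength into its (L1, L2) shares. */
  def split(regParam: Double, elasticNetParam: Double): (Double, Double) = {
    require(regParam >= 0.0, "regParam must be non-negative")
    require(elasticNetParam >= 0.0 && elasticNetParam <= 1.0,
      "elasticNetParam must be in [0, 1]")
    (elasticNetParam * regParam, (1.0 - elasticNetParam) * regParam)
  }

  /**
   * Penalty = l1 * sum(|w_i|) + (l2 / 2) * sum(w_i^2).
   * `weights` is assumed to exclude the intercept (bias) term.
   */
  def penalty(weights: Array[Double], regParam: Double, elasticNetParam: Double): Double = {
    val (l1, l2) = split(regParam, elasticNetParam)
    val l1Norm = weights.map(w => math.abs(w)).sum
    val l2NormSq = weights.map(w => w * w).sum
    l1 * l1Norm + 0.5 * l2 * l2NormSq
  }
}
```

With a mixing ratio of 0 the penalty is pure L2 and smooth. Any positive L1 share makes it non-differentiable at zero, which is the case an OWLQN-style optimizer is designed to handle.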

@SparkQA

SparkQA commented May 7, 2015

Test build #32083 has finished for PR 5967 at commit 34705bc.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 8, 2015

Test build #32221 has finished for PR 5967 at commit 8ec65d2.

  • This patch fails Scala style tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 8, 2015

Test build #32222 timed out for PR 5967 at commit a784321 after a configured wait of 150m.

@mengxr
Contributor

mengxr commented May 8, 2015

test this please

* Two MultiClassSummarizer instances can be merged together to have a statistical summary of the
* corresponding joint dataset.
*/
class MultiClassSummarizer private[ml] extends Serializable {
Contributor

private class?

Member Author

I want to test it in the suite.

Contributor

Then we can mark it package private. This shouldn't be a public class.
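
The distinction discussed in this thread can be sketched in plain Scala. A `private[ml]` modifier placed after the class name restricts only the constructor and leaves the type itself public, whereas `private[ml] class ...` hides the type outside `ml` while tests inside the same scope can still use it. Everything below is hypothetical: the enclosing `object ml` stands in for a package, and `LabelCounter` (which assumes the summary is a per-label count) is a made-up stand-in for the summarizer.

```scala
// Hypothetical sketch: `object ml` stands in for the `ml` package, and the
// class names and members are invented for illustration only.
object ml {

  // Public type, restricted constructor: code outside `ml` can still
  // reference `CtorRestricted` but cannot instantiate it.
  class CtorRestricted private[ml] () extends Serializable

  // Restricted type: not visible outside `ml` at all, yet usable from
  // any code (including tests) that lives inside `ml`.
  private[ml] class LabelCounter extends Serializable {
    private val counts = scala.collection.mutable.HashMap.empty[Int, Long]

    /** Record one occurrence of `label`. */
    def add(label: Int): this.type = {
      counts(label) = counts.getOrElse(label, 0L) + 1L
      this
    }

    /** Fold another counter into this one, giving the joint summary. */
    def merge(other: LabelCounter): this.type = {
      other.counts.foreach { case (label, c) =>
        counts(label) = counts.getOrElse(label, 0L) + c
      }
      this
    }

    def histogram: Map[Int, Long] = counts.toMap
  }

  // Merges two partial summaries, as would happen across data partitions.
  def demo(): Map[Int, Long] = {
    val left = new LabelCounter().add(0).add(1)
    val right = new LabelCounter().add(1)
    left.merge(right).histogram
  }
}
```

Calling `ml.demo()` merges the two partial counters and returns the combined label histogram, without exposing `LabelCounter` to callers outside `ml`.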

@mengxr
Contributor

mengxr commented May 8, 2015

@dbtsai I made a pass and it looks good to me except for some minor inline comments. Could you address the comments today? It should be good to go then.

@SparkQA

SparkQA commented May 8, 2015

Test build #32244 has finished for PR 5967 at commit a784321.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 8, 2015

Test build #32249 has finished for PR 5967 at commit f98e711.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@dbtsai
Member Author

dbtsai commented May 8, 2015

test this please

@SparkQA

SparkQA commented May 8, 2015

Test build #32257 has finished for PR 5967 at commit 5c31824.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class Binarizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class IDF(JavaEstimator, HasInputCol, HasOutputCol):
    • class IDFModel(JavaModel):
    • class Normalizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class OneHotEncoder(JavaTransformer, HasInputCol, HasOutputCol):
    • class PolynomialExpansion(JavaTransformer, HasInputCol, HasOutputCol):
    • class RegexTokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class StandardScaler(JavaEstimator, HasInputCol, HasOutputCol):
    • class StandardScalerModel(JavaModel):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol):
    • class StringIndexerModel(JavaModel):
    • class Tokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class VectorIndexer(JavaEstimator, HasInputCol, HasOutputCol):
    • class Word2Vec(JavaEstimator, HasStepSize, HasMaxIter, HasSeed, HasInputCol, HasOutputCol):
    • class Word2VecModel(JavaModel):
    • class HasSeed(Params):
    • class HasTol(Params):
    • class HasStepSize(Params):
    • case class UnresolvedExtractValue(child: Expression, extraction: Expression)
    • trait ExtractValue extends UnaryExpression
    • case class GetStructField(child: Expression, field: StructField, ordinal: Int)
    • case class GetArrayStructFields(
    • abstract class ExtractValueWithOrdinal extends ExtractValue
    • case class GetArrayItem(child: Expression, ordinal: Expression)
    • case class GetMapValue(child: Expression, ordinal: Expression)

@SparkQA

SparkQA commented May 8, 2015

Test build #32252 has finished for PR 5967 at commit c84e931.

  • This patch passes all tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

@mengxr
Contributor

mengxr commented May 8, 2015

test this please

@dbtsai
Member Author

dbtsai commented May 8, 2015

test this please

@SparkQA

SparkQA commented May 8, 2015

Test build #32273 has finished for PR 5967 at commit fa029bb.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class Binarizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class IDF(JavaEstimator, HasInputCol, HasOutputCol):
    • class IDFModel(JavaModel):
    • class Normalizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class OneHotEncoder(JavaTransformer, HasInputCol, HasOutputCol):
    • class PolynomialExpansion(JavaTransformer, HasInputCol, HasOutputCol):
    • class RegexTokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class StandardScaler(JavaEstimator, HasInputCol, HasOutputCol):
    • class StandardScalerModel(JavaModel):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol):
    • class StringIndexerModel(JavaModel):
    • class Tokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class VectorIndexer(JavaEstimator, HasInputCol, HasOutputCol):
    • class Word2Vec(JavaEstimator, HasStepSize, HasMaxIter, HasSeed, HasInputCol, HasOutputCol):
    • class Word2VecModel(JavaModel):
    • class HasSeed(Params):
    • class HasTol(Params):
    • class HasStepSize(Params):
    • case class UnresolvedExtractValue(child: Expression, extraction: Expression)
    • trait ExtractValue extends UnaryExpression
    • case class GetStructField(child: Expression, field: StructField, ordinal: Int)
    • case class GetArrayStructFields(
    • abstract class ExtractValueWithOrdinal extends ExtractValue
    • case class GetArrayItem(child: Expression, ordinal: Expression)
    • case class GetMapValue(child: Expression, ordinal: Expression)

@SparkQA

SparkQA commented May 8, 2015

Test build #32274 has finished for PR 5967 at commit fa029bb.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 8, 2015

Test build #32261 has finished for PR 5967 at commit 5c31824.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class UnresolvedExtractValue(child: Expression, extraction: Expression)
    • trait ExtractValue extends UnaryExpression
    • case class GetStructField(child: Expression, field: StructField, ordinal: Int)
    • case class GetArrayStructFields(
    • abstract class ExtractValueWithOrdinal extends ExtractValue
    • case class GetArrayItem(child: Expression, ordinal: Expression)
    • case class GetMapValue(child: Expression, ordinal: Expression)

@SparkQA

SparkQA commented May 8, 2015

Test build #32268 has finished for PR 5967 at commit 5c31824.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • public class EnumUtil
    • class Binarizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class IDF(JavaEstimator, HasInputCol, HasOutputCol):
    • class IDFModel(JavaModel):
    • class Normalizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class OneHotEncoder(JavaTransformer, HasInputCol, HasOutputCol):
    • class PolynomialExpansion(JavaTransformer, HasInputCol, HasOutputCol):
    • class RegexTokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class StandardScaler(JavaEstimator, HasInputCol, HasOutputCol):
    • class StandardScalerModel(JavaModel):
    • class StringIndexer(JavaEstimator, HasInputCol, HasOutputCol):
    • class StringIndexerModel(JavaModel):
    • class Tokenizer(JavaTransformer, HasInputCol, HasOutputCol):
    • class VectorIndexer(JavaEstimator, HasInputCol, HasOutputCol):
    • class Word2Vec(JavaEstimator, HasStepSize, HasMaxIter, HasSeed, HasInputCol, HasOutputCol):
    • class Word2VecModel(JavaModel):
    • class HasSeed(Params):
    • class HasTol(Params):
    • class HasStepSize(Params):
    • case class UnresolvedExtractValue(child: Expression, extraction: Expression)
    • trait ExtractValue extends UnaryExpression
    • case class GetStructField(child: Expression, field: StructField, ordinal: Int)
    • case class GetArrayStructFields(
    • abstract class ExtractValueWithOrdinal extends ExtractValue
    • case class GetArrayItem(child: Expression, ordinal: Expression)
    • case class GetMapValue(child: Expression, ordinal: Expression)

@mengxr
Contributor

mengxr commented May 8, 2015

test this please

@SparkQA

SparkQA commented May 9, 2015

Test build #32278 has finished for PR 5967 at commit fa029bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

asfgit pushed a commit that referenced this pull request May 9, 2015
…using OWLQN in new ML package

1) Handle scaling and addBias internally.
2) L1/L2 elasticnet using OWLQN optimizer.

Author: DB Tsai <dbt@netflix.com>

Closes #5967 from dbtsai/lor and squashes the following commits:

fa029bb [DB Tsai] made the bound smaller
0806002 [DB Tsai] better initial intercept and more test
5c31824 [DB Tsai] fix import
c387e25 [DB Tsai] Merge branch 'master' into lor
c84e931 [DB Tsai] Made MultiClassSummarizer private
f98e711 [DB Tsai] address feedback
a784321 [DB Tsai] fix style
8ec65d2 [DB Tsai] remove new line
f3f8c88 [DB Tsai] add more tests and they match R which is good. fix a bug
34705bc [DB Tsai] first commit

(cherry picked from commit 86ef4cf)
Signed-off-by: Xiangrui Meng <meng@databricks.com>
@mengxr
Contributor

mengxr commented May 9, 2015

Merged into master and branch-1.4. Thanks!

@asfgit asfgit closed this in 86ef4cf May 9, 2015
@dbtsai dbtsai deleted the lor branch May 10, 2015 09:24
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request May 28, 2015
jeanlyn pushed a commit to jeanlyn/spark that referenced this pull request Jun 12, 2015
nemccarthy pushed a commit to nemccarthy/spark that referenced this pull request Jun 19, 2015