Numba error "sequence item 0: expected str instance, type found"

Published 2020-01-16 22:12


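I want to select variables for a multiple regression analysis. I tried the code from http://planspace.org/20150423-forward_selection_with_statsmodels/ but I need to choose among 50 variables, and that takes too long. To speed it up I used Numba and wrote the following code: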

@jit
def forward_selected(data, response):
    """Linear model designed by forward selection.

    Parameters:
    -----------
    data : pandas DataFrame with all possible predictors and response

    response: string, name of response column in data

    Returns:
    --------
    model: an "optimal" fitted statsmodels linear model
           with an intercept
           selected by forward selection
           evaluated by adjusted R-squared
    """
    remaining = set(data.columns)
    remaining.remove(response)
    selected = [str]
    current_score, best_new_score = 0.0, 0.0
    while remaining and current_score == best_new_score:
        scores_with_candidates = [str]
        for candidate in remaining:
            formula = "{} ~ {} + 1".format(response,
                                           ' + '.join(selected + [candidate]))
            score = smf.ols(formula, data).fit().rsquared_adj
            scores_with_candidates.append((score, candidate))
        scores_with_candidates.sort()
        best_new_score, best_candidate = scores_with_candidates.pop()
        if current_score < best_new_score:
            remaining.remove(best_candidate)
            selected.append(best_candidate)
            current_score = best_new_score
    formula = "{} ~ {} + 1".format(response,
                                   ' + '.join(selected))
    model = smf.ols(formula, data).fit()
    return model

model = forward_selected(df, col)

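But it fails with the following error:

TypeError: sequence item 0: expected str instance, type found

Please tell me how to fix it. If anything in my question is unclear, I am happy to provide more information in the comments.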

Traceback (most recent call last):
  File "~/PycharmProjects/anacondaenv/touhu_1.py", line 164
    submit = predict(col)
  File "~/PycharmProjects/anacondaenv/touhu_1.py", line 75, in predict
    model = forward_selected(df, col)
TypeError: sequence item 0: expected str instance, type found
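
For reference, the message itself can be reproduced in plain Python without Numba or statsmodels. The sketch below assumes the cause is the `[str]` initializer in the question's code, which puts the type object `str` into the list that is later passed to `join`; the column name `rk` is used only as an illustrative value.

```python
# A list that holds the *type* `str`, not a string, as in `selected = [str]`.
selected = [str]
candidate = "rk"  # illustrative column name

try:
    " + ".join(selected + [candidate])
    error_message = None
except TypeError as exc:
    # join() rejects the first item because it is a type, not a str instance.
    error_message = str(exc)

print(error_message)

# Starting from an empty list produces a valid formula term instead.
fixed = " + ".join([] + [candidate])
print(fixed)
```

If that assumption holds, initializing both lists as plain empty lists (`[]`) removes this particular TypeError regardless of whether Numba is used.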


Solution


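I think one of the best ways to see whether numba can really act as a booster is to try the njit decorator instead of the jit decorator. njit forces no-python mode and fails if anything falls back to Python (which would offer no speed advantage at all). The short answer: don't use anything but np.ndarrays. So no strings, no tuples, no lists and NO calls to functions that are not jit-compiled.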

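With that in mind, I fixed the following error: numba does not allow an empty list in the main body of the function... I don't know why (maybe a bug?!), but it works if you move it inside the while block.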

import statsmodels.formula.api as smf
import numba as nb

@nb.jit
def forward_selected_nojit(data, response):
    """Linear model designed by forward selection.

    Parameters:
    -----------
    data : pandas DataFrame with all possible predictors and response

    response: string, name of response column in data

    Returns:
    --------
    model: an "optimal" fitted statsmodels linear model
           with an intercept
           selected by forward selection
           evaluated by adjusted R-squared
    """
    remaining = set(data.columns)
    remaining.remove(response)
    selected = None  # Changed this line
    current_score, best_new_score = 0.0, 0.0
    while remaining and current_score == best_new_score:
        if selected is None:  # Changed this and next line
            selected = []
        scores_with_candidates = []
        for candidate in remaining:
            formula = "{} ~ {} + 1".format(response,
                                           ' + '.join(selected + [candidate]))
            score = smf.ols(formula, data).fit().rsquared_adj
            scores_with_candidates.append((score, candidate))
        scores_with_candidates.sort()
        best_new_score, best_candidate = scores_with_candidates.pop()
        if current_score < best_new_score:
            remaining.remove(best_candidate)
            selected.append(best_candidate)
            current_score = best_new_score
    formula = "{} ~ {} + 1".format(response,
                                   ' + '.join(selected))
    model = smf.ols(formula, data).fit()
    return model

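This could be solved in a nicer way, but what matters here is the timing. First, though, check whether numba changes the result in any odd way: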

# With numba
sl ~ rk + yr + 1
0.835190760538

# Without numba
sl ~ rk + yr + 1
0.835190760538

So the results are identical. Now let's see how they perform:

# with numba
10 loops, best of 3: 264 ms per loop

# without numba
10 loops, best of 3: 252 ms per loop
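
The timings above are in the format printed by IPython's `%timeit`. A comparable measurement can be sketched with the standard-library `timeit` module; the `workload` function below is a hypothetical stand-in for a call such as `forward_selected_nojit(df, col)`.

```python
import timeit


def workload():
    """Placeholder for the function being measured."""
    return sum(i * i for i in range(1000))


loops = 10
# Run `loops` calls per measurement and repeat the measurement three times.
times = timeit.repeat(workload, number=loops, repeat=3)
best_per_loop = min(times) / loops
print(f"{loops} loops, best of 3: {best_per_loop * 1e3:.3f} ms per loop")
```

When timing a numba-decorated function this way, it may be worth calling it once beforehand so that compilation time is not counted in the measurement.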

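So this is exactly what I expected. With Python types and calls to external functions that are not jit-compiled, you won't get any speedup. You can make code faster with numba, but make sure you have read the numba documentation and checked what is supported: see the pages on supported Python types and NumPy types.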
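
One possible direction, sketched under stated assumptions, is to drop the formula strings and work on plain NumPy arrays. This is a different technique from the statsmodels code above, not the answer's method: `adjusted_r2` and `forward_select_columns` are hypothetical helpers that fit ordinary least squares with `np.linalg.lstsq`, take a predictor matrix `X` and response vector `y`, and return column indices rather than a fitted model.

```python
import numpy as np


def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit of y on the columns of X plus an intercept."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    r2 = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def forward_select_columns(X, y):
    """Greedy forward selection of column indices by adjusted R-squared."""
    remaining = list(range(X.shape[1]))
    selected = []
    current_score = 0.0
    while remaining:
        # Score every remaining column when added to the current selection.
        score, best = max(
            (adjusted_r2(X[:, selected + [j]], y), j) for j in remaining
        )
        if score <= current_score:
            break  # no candidate improves the score
        remaining.remove(best)
        selected.append(best)
        current_score = score
    return selected, current_score


# Synthetic data: the response depends on columns 0 and 2 only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.1 * rng.normal(size=200)

selected, score = forward_select_columns(X, y)
print(selected, score)
```

Whether this is faster than the statsmodels version for a given dataset would need to be measured; the array-only form is mainly a starting point that avoids DataFrames, strings and formula parsing.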




Author: 黑洞官方问答小能手

Link: https://www.pythonheidong.com/blog/article/226451/d8e68e3c82d9862133d3/

Source: python黑洞网
